IMAGING, DIAGNOSIS AND TREATMENT OF DISEASE

The present invention relates to genes whose expression is selective for the
endothelium and use of these genes or gene products, or molecules which bind
thereto, in imaging, diagnosis and treatment of conditions involving the
vascular endothelium.

The endothelium plays a central role in many physiological and pathological
processes and it is known to be an exceptionally active transcriptional site.
Approximately 1,000 distinct genes are expressed in an endothelial cell. In
contrast, red blood cells were found to express 8, platelets 22 and smooth
muscle 127 separate genes (Adams et al, 1995). Known endothelial specific
genes attract much attention from both the basic research and the clinical
communities. For example, the endothelial specific tyrosine kinases Tie,
TIE2/TEK, KDR, and flt1 are crucial players in the regulation of vascular
integrity, endothelium-mediated inflammatory processes and angiogenesis
(Sato et al, 1993, Sato et al, 1995, Fong et al, 1995, Shalaby et al, 1995, Aiello
et al, 1995). Angiogenesis is now widely recognised as a rate-limiting process
for the growth of solid tumours. It is also implicated in the formation of
atherosclerotic plaques and restenosis. Finally, the endothelium plays a central
role in the complex and dynamic system regulating coagulation and hemostasis.

Of the many distinct genes expressed in an endothelial cell, not all are entirely
endothelial cell selective and so the genes and their products, and molecules
which bind thereto, are not generally useful in the imaging, diagnosis and
treatment of disease. Thus, there remains a need for endothelial cell specific or
selective molecules.
We report here the identification of two highly endothelial selective genes which
we have called endothelial cell-specific molecule 1 (ECSM1) and magic
roundabout (endothelial cell-specific molecule 4; ECSM4). The terms ECSM1
and ECSM4 are also used to indicate, as the context will make clear, the cDNA
and polypeptides encoded by the genes. These genes, and especially ECSM4,
are surprisingly specific in their cell expression profile. ECSM4, for example,
shows similar endothelial-cell selectivity to the marker currently accepted in
the art as the best endothelial cell marker (von Willebrand factor). Clearly,
such a high level of endothelial cell specificity is both unprecedented and
unexpected.

ECSM1 (UniGene entry Hs.13957) has no protein or nucleotide homologues.
It is most likely to code for a small protein of 103 aa (the longest and most
upstream open reading frame which was identified in the contig sequence).
ECSM1 contains two sequence tagged sites which are unique and definite
within the genome (STS sites; dbSTS G26129 and G28043) and localise to
chromosome 19. A polynucleotide comprising the complement of part of the
ECSM1 gene is described in WO 99/06423 (Human Genome Sciences)
(termed "gene 22"; pages 31-32) as being expressed primarily in umbilical cord
endothelial cells and to a lesser extent in human adipose tissue. However, WO
99/06423 discloses an open reading frame (ORF) in the polynucleotide which
encodes a polypeptide of only 45 amino acids. According to our analyses, this
does not represent the correct polypeptide of 103 amino acids, as the actual
start codon in ECSM1 is further 5' than the one identified in WO 99/06423.
The human magic roundabout (ECSM4) cDNA clone with a long ORF of more
than 417 aa (GenBank Accession No AK000805) and described in
WO 99/46281 as a 3716 nucleotide sequence was identified by BLAST
searches for the Hs.111518 contig. This sequence is rich in prolines and has
several regions of low amino acid complexity. A BLAST PRODOM search
(protein families database at HGMP, UK) identified a 120 bp region of
homology to the cytoplasmic domain of a conserved family of transmembrane
receptors involved in repulsive axon guidance (ROBO1 DUTT1 protein family,
E=4e-07). Homology was extended to 468 aa (E=1.3e-09) when a more
rigorous analysis was performed using ssearch (Smith and Waterman 1981) but
the region of similarity was still confined to the cytoplasmic domain. The
ROBO1 DUTT1 family comprises the human roundabout homologue 1
(ROBO1), the mouse gene DUTT1 and the rat ROBO1 (Kidd et al, 1998,
Brose et al, 1999). Because of this region of homology we called the gene
represented by Hs.111518 "magic roundabout" (ECSM4). Additionally,
BLAST SBASE (protein domain database at HGMP) suggested a region of
similarity to the domain of the intracellular neural cell adhesion molecule long
domain form precursor (E=2e-11). It should be noted that the true protein
product for magic roundabout is likely to be larger than the 417 aa coded in the
AK000805 clone since the ORF has no apparent upstream limit, and size
comparison to human roundabout 1 (1651 aa) suggests a much bigger protein.
This is confirmed in Figure 3 which shows the translation product of human
ECSM4 to be around 118 kDa. However, ECSM4 is smaller than other
members of the roundabout family, sharing only two of the five Ig domains and
two of the three fibronectin domains in the extracellular region. The putative
intracellular proline-rich regions that are homologous to those in
roundabout are thought to couple to c-abl. Figure 12 shows the full length
amino acid sequence of human ECSM4 (1105 aa), and the sequence of the
mouse homologue is shown in Figure 13. Nucleotide coding sequences which
display around 99% identity to the ECSM4 nucleotide sequence given in
Figure 12 are disclosed in WO 99/11293 and WO 99/53051.

Additional sequences which display homology to the ECSM4 polypeptide or
polynucleotide sequence are disclosed in EP 1 074 617, WO 00/53756, WO
99/46281, WO 01/23523 and WO 99/11293. However, none of these
publications disclose that the sequences are selectively expressed in the
vascular endothelium, nor suggest that they may be so expressed.

Recently, intriguing associations between neuronal differentiation genes and
endothelial cells have been discovered. For example, a neuronal receptor for
vascular endothelial growth factor (VEGF), neuropilin 1 (Soker et al, 1998), was
identified. VEGF was traditionally regarded as an exclusively endothelial
growth factor. Processes similar to neuronal axon guidance are now being
implicated in guiding migration of endothelial cells during angiogenic capillary
sprouting. Thus ephrinB ligands and EphB receptors are involved in
demarcation of arterial and venous domains (Adams et al, 1999). It is possible
that magic roundabout (ECSM4) may be an endothelial specific homologue of
the human roundabout 1 involved in endothelial cell repulsive guidance,
presumably with a different ligand since similarity is contained within the
cytoplasmic, i.e. effector, region and guidance receptors are known to have
highly modular architecture (Bashaw and Goodman 1999).

WO 02/36771    PCT/GB01/04906

However, to date there has been no mention of the existence of an endothelial
counterpart, nor of the expression pattern of the magic roundabout (ECSM4) gene
being restricted to endothelial cells, especially angiogenic endothelial cells,
nor of any function of the encoded polypeptide.

It should be noted that a surprising result of our RT-PCR analysis, described in 
Example 1, was that genes identified here appear to show endothelial 
specificity (Fig. 1) comparable with the classic endothelial marker von 
Willebrand factor (vWF). Expression of known endothelial specific genes is 
not usually 100% restricted to the endothelial cell. The data presented herein
show the finding, quite unanticipated given the less than 100% selectivity of
known endothelial cell markers, that ECSM4 is not expressed at detectable
levels (at least using the methods described in the examples) in cell types
other than endothelial cells. Ribonuclease protection analysis has
confirmed and extended this observation (Figure 14a). ECSM4 expression was 
seen to be restricted to endothelium (three different isolates) and absent from 
fibroblast, carcinoma and neuronal cells. KDR and FLT1 are both expressed in 
the male and female reproductive tract: on spermatogenic cells (Obermair et al, 
1999), trophoblasts, and in decidua (Clark et al, 1996). KDR has been shown 
to define haematopoietic stem cells (Ziegler et al, 1999). FLT1 is also present 
on monocytes. In addition to endothelial cells vWF is strongly expressed in 
megakaryocytes (Sporn et al, 1985, Nichols et al, 1985), and in consequence 
present on platelets. Similarly, multimerin is present both in endothelial cells 
(Hayward et al, 1993) and platelets (Hayward et al, 1998). 

Generally speaking, endothelial and haematopoietic cells descend from the same
embryonic precursors, haemangioblasts, and many cellular markers are shared
between these two cell lineages (for review see Suda et al, 2000). Hence, the
finding that the genes ECSM1 and ECSM4 are not expressed in cells other than
those of the vascular endothelium is highly surprising.

Determination of genes whose expression is selective for the vascular
endothelium allows selective targeting to these cells and thereby the specific
delivery of molecules for imaging, diagnosis, prognosis, treatment, prevention
and evaluation of therapies for conditions associated with normal or aberrant
vascular growth.

A first aspect of the invention provides a compound comprising (i) a moiety
which selectively binds the polypeptide ECSM4 and (ii) a further moiety.

By "the polypeptide ECSM4" we include a polypeptide whose sequence
comprises or consists of the amino acid sequence given in Figure 4 or 5 or 7 or
12 or 13 or whose sequence is encoded by the nucleotide sequence given in
Figure 4 between nucleotides 1 and 1395 or between nucleotides 2 and 948 of
Figure 5 or Figure 7 or between nucleotides 71 and 3442 of Figure 12 or
between nucleotides 6 and 3050 of Figure 13 and natural variants thereof.
Preferably, the ECSM4 polypeptide is one whose amino acid sequence
comprises the sequence given in Figure 4 or Figure 12.

By "the polypeptide ECSM4" we include a polypeptide represented by SEQ ID
No 18085 of EP 1 074 617, SEQ ID No 211 of either WO 00/53756 or
WO 99/46281, SEQ ID Nos 24-27, 29, 30, 33, 34, 38 or 39 of WO 01/23523, or
SEQ ID No 86 of WO 99/11293, or the polypeptide represented by SEQ ID No
18084 or 5096 of EP 1 074 617, SEQ ID No 210 of WO 00/53756 or
WO 99/46281, or SEQ ID Nos 22, 23, 96 or 98 of WO 01/23523 or SEQ ID
No 31 of WO 99/11293.

By "the polypeptide ECSM4" we also include any naturally occurring
polypeptide which comprises a consecutive 50 amino acid residue portion or
natural variants thereof of the polypeptide sequence given in Figure 4 or 5 or 7
or 12 or 13. Preferably, the polypeptide is a human polypeptide.
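
For illustration only, the "consecutive 50 amino acid residue portion" criterion above could be checked computationally along the following lines. This is a minimal sketch and not part of the disclosed method: the function name, the one-letter-code string representation and the toy sequences are assumptions, and the check covers exact matches only, so it does not address "natural variants thereof".

```python
def shares_consecutive_portion(candidate: str, reference: str, window: int = 50) -> bool:
    """Return True if `candidate` contains at least one run of `window`
    consecutive residues taken from `reference` (exact match only).

    Both sequences are plain one-letter amino acid strings. This helper is
    hypothetical and illustrative; it ignores natural (e.g. allelic) variants.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < window or len(reference) < window:
        return False
    # Slide a window along the reference and look for each portion in the candidate.
    return any(
        reference[i:i + window] in candidate
        for i in range(len(reference) - window + 1)
    )


# Toy 100-residue reference (NOT a real ECSM4 sequence).
toy_reference = "ACDEFGHIKLMNPQRSTVWY" * 5
toy_candidate = "MM" + toy_reference[10:70] + "KK"
print(shares_consecutive_portion(toy_candidate, toy_reference))
```

In practice the reference would be one of the sequences given in the Figures, and a tolerant alignment method would be needed to account for variant residues.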

Embodiments and features of this aspect of the invention are as described in
more detail below.

A second aspect of the invention provides a compound comprising (i) a moiety 
which selectively binds the polypeptide ECSM1 and (ii) a further moiety. 

Preferably, in the first and second aspects of the invention, the binding moiety
and further moiety are covalently attached.



By "the polypeptide ECSM1" we include a polypeptide whose amino acid
sequence comprises or consists of the sequence given in Figure 2 and natural
variants thereof.



By "the polypeptide ECSM1" we also include any naturally occurring
polypeptide which comprises a consecutive 50 amino acid residue portion or
natural variants thereof of the polypeptide sequence given in Figure 2.
Preferably, the polypeptide is a human polypeptide.
Preferably, the polypeptide ECSM1 amino acid sequence comprises the
sequence given in Figure 2 but does not comprise the amino acid sequence
encoded by ATCC deposit No 209145 made on July 17 1997 for the purposes
of WO 99/06423.

By "natural variants" we include, for example, allelic variants. Typically, 
these will vary from the given sequence by only one or two or three, and 
typically no more than 10 or 20 amino acid residues. Typically, the variants 
have conservative substitutions. 
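
As a rough illustration of the numerical limits just given, the number of residue differences between a candidate variant and a given sequence could be counted as follows. This is a sketch under stated assumptions: the function names are hypothetical, the sequences are assumed to be of equal length and already aligned, and no attempt is made to judge whether a substitution is conservative.

```python
def count_substitutions(variant: str, reference: str) -> int:
    """Count positions at which two equal-length, pre-aligned one-letter
    amino acid sequences differ (hypothetical helper)."""
    if len(variant) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(variant.upper(), reference.upper()))


def within_variant_limit(variant: str, reference: str, max_differences: int = 20) -> bool:
    """Return True if the variant differs from the reference by no more than
    `max_differences` residues (20 being the upper figure mentioned in the text)."""
    return count_substitutions(variant, reference) <= max_differences


# Toy sequences (NOT real ECSM1 or ECSM4 sequences).
print(count_substitutions("MKTLLVAG", "MKTLIVAG"))
print(within_variant_limit("MKTLLVAG", "MKTLIVAG", max_differences=3))
```

Insertions and deletions would require a proper sequence alignment before counting, which this sketch deliberately leaves out.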

In a preferred embodiment of the first or second aspects of the invention, the 
moiety capable of selectively binding to the specified polypeptide is an 
antibody. 

Preferably, an antibody which selectively binds ECSM1 or a natural variant
thereof is not one which binds a polypeptide encoded by SEQ ID No 32 of WO
99/06423 or encoded by the nucleic acid of ATCC deposit No 209145 made on
July 17 1997 for the purposes of WO 99/06423.

Preferably, an antibody which selectively binds ECSM1 is one which binds a
polypeptide whose amino acid sequence comprises the sequence given in
Figure 2 or a natural variant thereof but which polypeptide does not comprise
the amino acid sequence encoded by ATCC deposit No 209145 made on July
17 1997.

Preferably, an antibody which selectively binds ECSM4 is one which
selectively binds a polypeptide with the sequence GGDSLLGGRGSL,
LLQPPARGHAHDGQALSTDL, EPQDYTEPVE, TAPGGQGAPWAEE or
ERATQEPSEHGP or a sequence which is located in the extracellular portion
of ECSM4. As described in more detail below, these sequences represent
amino acid sequences which are only found in the human ECSM4 and are not
found in the mouse ECSM4 polypeptide sequence.
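
The distinction drawn above, between peptide sequences present in human ECSM4 and absent from mouse ECSM4, can be sketched as a simple substring comparison. The peptide list is taken from the text; the function name and the two short toy sequences are hypothetical placeholders and are not the sequences of Figures 12 and 13.

```python
# Peptide sequences listed in the text as found only in human ECSM4.
CANDIDATE_PEPTIDES = [
    "GGDSLLGGRGSL",
    "LLQPPARGHAHDGQALSTDL",
    "EPQDYTEPVE",
    "TAPGGQGAPWAEE",
    "ERATQEPSEHGP",
]


def human_specific(peptides, human_seq, mouse_seq):
    """Return the peptides that occur in `human_seq` but not in `mouse_seq`
    (exact substring match; hypothetical helper for illustration)."""
    return [p for p in peptides if p in human_seq and p not in mouse_seq]


# Toy placeholder sequences, invented for this example only.
human_toy = "MKT" + "GGDSLLGGRGSL" + "AAA" + "EPQDYTEPVE"
mouse_toy = "MKT" + "GGDSLVGGRGSL" + "AAA" + "EPQDYTEPVE"

print(human_specific(CANDIDATE_PEPTIDES, human_toy, mouse_toy))
```

With the real human and mouse sequences in place of the toy strings, the same comparison would indicate which candidate epitopes distinguish the two species.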

Preferably, the moiety which selectively binds ECSM4, such as an antibody,
is one which binds a polypeptide whose amino acid sequence comprises the
sequence given in any one of Figures 4, 5, 7, 12 or 13 or a natural variant
thereof but does not bind the polypeptide represented by any one of SEQ ID
No 18085 of EP 1 074 617, SEQ ID No 211 of either WO 00/53756 or
WO 99/46281, SEQ ID Nos 24-27, 29, 30, 33, 34, 38 or 39 of
WO 01/23523, or SEQ ID No 86 of WO 99/11293, or encoded by any one
of the nucleotide sequences represented by SEQ ID No 18084 or 5096 of
EP 1 074 617, SEQ ID No 210 of WO 00/53756 or WO 99/46281, or SEQ
ID Nos 22, 23, 96 or 98 of WO 01/23523 and SEQ ID No 31 of
WO 99/11293.

By "antibody" we include not only whole immunoglobulin molecules but also
fragments thereof such as Fab, F(ab')2, Fv and other fragments thereof that
retain the antigen-binding site. Similarly the term "antibody" includes
genetically engineered derivatives of antibodies such as single chain Fv
molecules (scFv) and domain antibodies (dAbs). The term also includes
antibody-like molecules which may be produced using phage-display
techniques or other random selection techniques for molecules which bind to
ECSM1 or ECSM4.
The variable heavy (VH) and variable light (VL) domains of the antibody are
involved in antigen recognition, a fact first recognised by early protease digestion
experiments. Further confirmation was found by "humanisation" of rodent
antibodies. Variable domains of rodent origin may be fused to constant domains
of human origin such that the resultant antibody retains the antigenic specificity
of the rodent parented antibody (Morrison et al (1984) Proc. Natl. Acad. Sci. USA
81, 6851-6855).

That antigenic specificity is conferred by variable domains and is independent of
the constant domains is known from experiments involving the bacterial
expression of antibody fragments, all containing one or more variable domains.
These molecules include Fab-like molecules (Better et al (1988) Science 240,
1041); Fv molecules (Skerra et al (1988) Science 240, 1038); single-chain Fv
(ScFv) molecules where the VH and VL partner domains are linked via a flexible
oligopeptide (Bird et al (1988) Science 242, 423; Huston et al (1988) Proc. Natl.
Acad. Sci. USA 85, 5879) and single domain antibodies (dAbs) comprising
isolated V domains (Ward et al (1989) Nature 341, 544). A general review of the
techniques involved in the synthesis of antibody fragments which retain their
specific binding sites is to be found in Winter & Milstein (1991) Nature 349,
293-299.

By "ScFv molecules" we mean molecules wherein the VH and VL partner
domains are linked via a flexible oligopeptide.

The advantages of using antibody fragments, rather than whole antibodies, are
several-fold. The smaller size of the fragments may lead to improved
pharmacological properties, such as better penetration to the target site. Effector
functions of whole antibodies, such as complement binding, are removed. Fab,
Fv, ScFv and dAb antibody fragments can all be expressed in and secreted from
E. coli, thus allowing the facile production of large amounts of the said
fragments.

Whole antibodies, and F(ab')2 fragments are "bivalent". By "bivalent" we mean
that the said antibodies and F(ab')2 fragments have two antigen combining sites.
In contrast, Fab, Fv, ScFv and dAb fragments are monovalent, having only one
antigen combining site.

Although the antibody may be a polyclonal antibody, it is preferred if it is a
monoclonal antibody. In some circumstances, particularly if the antibody is
going to be administered repeatedly to a human patient, it is preferred if the
monoclonal antibody is a human monoclonal antibody or a humanised
monoclonal antibody.

Suitable monoclonal antibodies which are reactive as said may be prepared by
known techniques, for example those disclosed in "Monoclonal Antibodies: A
manual of techniques", H Zola (CRC Press, 1988) and in "Monoclonal
Hybridoma Antibodies: Techniques and Application", SGR Hurrell (CRC
Press, 1982). Polyclonal antibodies may be produced which are polyspecific
or monospecific. It is preferred that they are monospecific.

Chimaeric antibodies are discussed by Neuberger et al (1998, 8th International
Biotechnology Symposium Part 2, 792-799).
Suitably prepared non-human antibodies can be "humanised" in known ways,
for example by inserting the CDR regions of mouse antibodies into the
framework of human antibodies.

The antibodies may be human antibodies in the sense that they have the amino
acid sequence of human anti-ECSM1 or -ECSM4 antibodies but they may be
prepared using methods known in the art that do not require immunisation of
humans. For example, transgenic mice are available which contain, in essence,
human immunoglobulin genes (see Vaughan et al (1998) Nature Biotechnol
16, 535-539).

In an alternative embodiment, the moiety capable of selectively binding to a
polypeptide is a peptide. The ECSM4/magic roundabout polypeptide shows
homology with the Drosophila, mouse and human roundabout proteins, which
are cell surface receptors for secreted Slit proteins (Li et al (1996) Cell 96:807-818).
Any cognate ligand for ECSM4/magic roundabout which is capable of
selectively binding the region of the polypeptide which is located
extracellularly may be useful. The extracellular region of ECSM4 is likely to
be located within residues 1-467 of the ECSM4 polypeptide sequence given in
Figure 12. It is believed that certain peptides may be cognate ligands for
ECSM4. Such a peptide will be a suitable moiety for selectively binding
ECSM4/magic roundabout. Peptides binding ECSM4 can be identified by
means of a screen. A suitable method or screen for identifying peptides or
other molecules which selectively bind ECSM4 may comprise contacting the
ECSM4 polypeptide with a test peptide or other molecule under conditions
where binding can occur, and then determining if the test molecule or peptide
has bound ECSM4. Methods of detecting binding between two moieties are
well known in the art of biochemistry. Preferably, the known technique of
phage display is used to identify peptides or other ligand molecules which bind
to ECSM4. An alternative method includes the yeast two hybrid system.

Peptides or other agents which selectively bind ECSM4 include those which
modulate or block the function of ECSM4.

Suitable peptides may be synthesised as described in more detail below. 

The further moiety may be any further moiety which confers on the compound
a useful property with respect to the treatment or imaging or diagnosis of
diseases or other conditions or states which involve undesirable neovasculature
formation. Such diseases or other conditions or states are described in more
detail below. In particular, the further moiety is one which is useful in killing
or imaging neovasculature associated with the growth of a tumour. Preferably,
the further moiety is one which is able to kill the endothelial cells to which the
compound is targeted.

In a preferred embodiment of the invention the further moiety is directly or
indirectly cytotoxic. In particular the further moiety is preferably directly or
indirectly toxic to cells in neovasculature or cells which are in close proximity
to and associated with neovasculature.

By "directly cytotoxic" we include the meaning that the moiety is one which
on its own is cytotoxic. By "indirectly cytotoxic" we include the meaning that
the moiety is one which, although it is not itself cytotoxic, can induce
cytotoxicity, for example by its action on a further molecule or by further
action on it.

In one embodiment the cytotoxic moiety is a cytotoxic chemotherapeutic agent.
Cytotoxic chemotherapeutic agents are well known in the art.

Cytotoxic chemotherapeutic agents, such as anticancer agents, include:
alkylating agents including nitrogen mustards such as mechlorethamine (HN2),
cyclophosphamide, ifosfamide, melphalan (L-sarcolysin) and chlorambucil;
ethylenimines and methylmelamines such as hexamethylmelamine and thiotepa;
alkyl sulphonates such as busulfan; nitrosoureas such as carmustine (BCNU),
lomustine (CCNU), semustine (methyl-CCNU) and streptozocin (streptozotocin);
and triazenes such as dacarbazine (DTIC;
dimethyltriazenoimidazolecarboxamide); Antimetabolites including folic acid
analogues such as methotrexate (amethopterin); pyrimidine analogues such as
fluorouracil (5-fluorouracil; 5-FU), floxuridine (fluorodeoxyuridine; FUdR) and
cytarabine (cytosine arabinoside); and purine analogues and related inhibitors
such as mercaptopurine (6-mercaptopurine; 6-MP), thioguanine (6-thioguanine;
TG) and pentostatin (2'-deoxycoformycin). Natural Products including vinca
alkaloids such as vinblastine (VLB) and vincristine; epipodophyllotoxins such as
etoposide and teniposide; antibiotics such as dactinomycin (actinomycin D),
daunorubicin (daunomycin; rubidomycin), doxorubicin, bleomycin, plicamycin
(mithramycin) and mitomycin (mitomycin C); enzymes such as L-asparaginase;
and biological response modifiers such as interferon alphenomes. Miscellaneous
agents including platinum coordination complexes such as cisplatin (cis-DDP)
and carboplatin; anthracenedione such as mitoxantrone and anthracycline;
substituted urea such as hydroxyurea; methyl hydrazine derivative such as
procarbazine (N-methylhydrazine, MIH); and adrenocortical suppressant such as
mitotane (o,p'-DDD) and aminoglutethimide; taxol and analogues/derivatives;
and hormone agonists/antagonists such as flutamide and tamoxifen.

Various of these agents have previously been attached to antibodies and other
target site-delivery agents, and so compounds of the invention comprising
these agents may readily be made by the person skilled in the art. For example,
carbodiimide conjugation (Bauminger & Wilchek (1980) Methods Enzymol
70, 151-159; incorporated herein by reference) may be used to conjugate a
variety of agents, including doxorubicin, to antibodies or peptides.

Carbodiimides comprise a group of compounds that have the general formula
R-N=C=N-R', where R and R' can be aliphatic or aromatic, and are used for
synthesis of peptide bonds. The preparative procedure is simple, relatively
fast, and is carried out under mild conditions. Carbodiimide compounds attack
carboxylic groups to change them into reactive sites for free amino groups.

The water soluble carbodiimide, 1-ethyl-3-(3-dimethylaminopropyl)
carbodiimide (EDC) is particularly useful for conjugating a functional moiety
to a binding moiety and may be used to conjugate doxorubicin to tumor
homing peptides. The conjugation of doxorubicin and a binding moiety
requires the presence of an amino group, which is provided by doxorubicin,
and a carboxyl group, which is provided by the binding moiety such as an
antibody or peptide.

In addition to using carbodiimides for the direct formation of peptide bonds,
EDC also can be used to prepare active esters such as N-hydroxysuccinimide
(NHS) ester. The NHS ester, which binds only to amino groups, then can be
used to induce the formation of an amide bond with the single amino group of
the doxorubicin. EDC and NHS are commonly used in combination for
conjugation in order to increase the yield of conjugate formation (Bauminger &
Wilchek, supra, 1980).

Other methods for conjugating a functional moiety to a binding moiety also can 
be used. For example, sodium periodate oxidation followed by reductive 
alkylation of appropriate reactants can be used, as can glutaraldehyde cross- 
linking. However, it is recognised that, regardless of which method of 
producing a conjugate of the invention is selected, a determination must be 
made that the binding moiety maintains its targeting ability and that the 
functional moiety maintains its relevant function. 

In a further embodiment of the invention, the cytotoxic moiety is a cytotoxic
peptide or polypeptide moiety by which we include any moiety which leads to
cell death. Cytotoxic peptide and polypeptide moieties are well known in the
art and include, for example, ricin, abrin, Pseudomonas exotoxin, tissue factor
and the like. Methods for linking them to targeting moieties such as antibodies
are also known in the art. The use of ricin as a cytotoxic agent is described in
Burrows & Thorpe (1993) Proc. Natl. Acad. Sci. USA 90, 8996-9000,
incorporated herein by reference, and the use of tissue factor, which leads to
localised blood clotting and infarction of a tumour, has been described by Ran
et al (1998) Cancer Res. 58, 4646-4653 and Huang et al (1997) Science 275,
547-550. Tsai et al (1995) Dis. Colon Rectum 38, 1067-1074 describes the
abrin A chain conjugated to a monoclonal antibody and is incorporated herein
by reference. Other ribosome inactivating proteins are described as cytotoxic
agents in WO 96/06641. Pseudomonas exotoxin may also be used as the
cytotoxic polypeptide moiety (see, for example, Aiello et al (1995) Proc. Natl.
Acad. Sci. USA 92, 10457-10461; incorporated herein by reference).

Certain cytokines, such as TNFα and IL-2, may also be useful as cytotoxic
agents.

Certain radioactive atoms may also be cytotoxic if delivered in sufficient doses.
Thus, the cytotoxic moiety may comprise a radioactive atom which, in use,
delivers a sufficient quantity of radioactivity to the target site so as to be
cytotoxic. Suitable radioactive atoms include phosphorus-32, iodine-125,
iodine-131, indium-111, rhenium-186, rhenium-188 or yttrium-90, or any other
isotope which emits enough energy to destroy neighbouring cells, organelles or
nucleic acid. Preferably, the isotopes and density of radioactive atoms in the
compound of the invention are such that a dose of more than 4000 cGy
(preferably at least 6000, 8000 or 10000 cGy) is delivered to the target site and,
preferably, to the cells at the target site and their organelles, particularly the
nucleus.

The radioactive atom may be attached to the binding moiety in known ways.
For example EDTA or another chelating agent may be attached to the binding
moiety and used to attach 111In or 90Y. Tyrosine residues may be labelled with
125I or 131I.

The cytotoxic moiety may be a suitable indirectly cytotoxic polypeptide. In a
particularly preferred embodiment, the indirectly cytotoxic polypeptide is a
polypeptide which has enzymatic activity and can convert a relatively non-toxic
prodrug into a cytotoxic drug. When the targeting moiety is an antibody
this type of system is often referred to as ADEPT (Antibody-Directed Enzyme
Prodrug Therapy). The system requires that the targeting moiety locates the
enzymatic portion to the desired site in the body of the patient (ie the site
expressing ECSM1 or ECSM4, such as new vascular tissue associated with a
tumour) and, after allowing time for the enzyme to localise at the site,
administering a prodrug which is a substrate for the enzyme, the end product of
the catalysis being a cytotoxic compound. The object of the approach is to
maximise the concentration of drug at the desired site and to minimise the
concentration of drug in normal tissues (see Senter, P.D. et al (1988)
"Anti-tumor effects of antibody-alkaline phosphatase conjugates in combination
with etoposide phosphate" Proc. Natl. Acad. Sci. USA 85, 4842-4846; Bagshawe
(1987) Br. J. Cancer 56, 531-2; and Bagshawe, K.D. et al (1988) "A cytotoxic
agent can be generated selectively at cancer sites" Br. J. Cancer 58, 700-703).

Clearly, any ECSM1 or ECSM4 binding moiety may be used in place of an
anti-ECSM1 or anti-ECSM4 antibody in this type of directed enzyme prodrug
therapy system.

The enzyme and prodrug of the system using an ECSM1 or ECSM4 targeted
enzyme as described herein may be any of those previously proposed. The
cytotoxic substance may be any existing anti-cancer drug such as an alkylating
agent; an agent which intercalates in DNA; an agent which inhibits any key
enzymes such as dihydrofolate reductase, thymidine synthetase, ribonucleotide
reductase, nucleoside kinases or topoisomerase; or an agent which effects cell
death by interacting with any other cellular constituent. Etoposide is an
example of a topoisomerase inhibitor.
Reported prodrug systems include: a phenol mustard prodrug activated by an
E. coli β-glucuronidase (Wang et al, 1992 and Roffler et al, 1991); a
doxorubicin prodrug activated by a human β-glucuronidase (Bosslet et al,
1994); further doxorubicin prodrugs activated by coffee bean α-galactosidase
(Azoulay et al, 1995); daunorubicin prodrugs, activated by coffee bean
α-D-galactosidase (Gesson et al, 1994); a 5-fluorouridine prodrug activated by
an E. coli β-D-galactosidase (Abraham et al, 1994); and methotrexate prodrugs (eg
methotrexate-alanine) activated by carboxypeptidase A (Kuefner et al, 1990,
Vitols et al, 1992 and Vitols et al, 1995). These and others are included in the
following table.
Enzyme                  Prodrug

Carboxypeptidase G2     Derivatives of L-glutamic acid and benzoic acid mustards, aniline mustards, phenol mustards and phenylenediamine mustards; fluorinated derivatives of these

Alkaline phosphatase    Etoposide phosphate; Mitomycin phosphate

Beta-glucuronidase      p-Hydroxyaniline mustard-glucuronide; Epirubicin-glucuronide

Penicillin-V-amidase    Adriamycin-N phenoxyacetyl

Penicillin-G-amidase    N-(4'-hydroxyphenyl acetyl) palytoxin; Doxorubicin and melphalan

Beta-lactamase          Nitrogen mustard-cephalosporin p-phenylenediamine; doxorubicin derivatives; vinblastine derivative-cephalosporin, cephalosporin mustard; a taxol derivative

Beta-glucosidase        Cyanophenylmethyl-beta-D-gluco-pyranosiduronic acid

Nitroreductase          5-(Aziridin-1-yl)-2,4-dinitrobenzamide

Cytosine deaminase      5-Fluorocytosine

Carboxypeptidase A      Methotrexate-alanine


(This table is adapted from Bagshawe (1995) Drug Dev. Res. 34, 220-230,
from which full references for these various systems may be obtained; the taxol 
derivative is described in Rodrigues, M.L. et al (1995) Chemistry & Biology 2, 
223). 
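
As an illustration only, a few of the enzyme/prodrug pairings listed in the table can be held in a simple lookup structure. The sketch below is in Python; the names ENZYME_PRODRUGS and enzymes_for are hypothetical, and only a small subset of the table is shown.

```python
# Illustrative subset of the enzyme/prodrug pairings from the table.
# The variable and function names are hypothetical, not part of the specification.
ENZYME_PRODRUGS = {
    "Alkaline phosphatase": ["Etoposide phosphate", "Mitomycin phosphate"],
    "Cytosine deaminase": ["5-Fluorocytosine"],
    "Carboxypeptidase A": ["Methotrexate-alanine"],
}


def enzymes_for(prodrug):
    """Return the enzymes in the subset that are listed as activating `prodrug`."""
    wanted = prodrug.strip().lower()
    return [
        enzyme
        for enzyme, prodrugs in ENZYME_PRODRUGS.items()
        if any(p.lower() == wanted for p in prodrugs)
    ]
```

A call such as enzymes_for("Etoposide phosphate") looks up which targeted enzyme would be paired with that prodrug in an ADEPT-type system.
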

Suitable enzymes for forming part of the enzymatic portion of the invention
include: exopeptidases, such as carboxypeptidases G, G1 and G2 (for
glutamylated mustard prodrugs), carboxypeptidases A and B (for MTX-based
prodrugs) and aminopeptidases (for 2-α-aminoacyl MTC prodrugs);
endopeptidases, such as thrombolysin (for thrombin prodrugs); hydrolases,
such as phosphatases (eg alkaline phosphatase) or sulphatases (eg aryl
sulphatases) (for phosphorylated or sulphated prodrugs); amidases, such as
penicillin amidases and arylacyl amidase; lactamases, such as β-lactamases;
glycosidases, such as β-glucuronidase (for β-glucuronomide anthracyclines),
α-galactosidase (for amygdalin) and β-galactosidase (for β-galactose
anthracycline); deaminases, such as cytosine deaminase (for 5FC); kinases,
such as urokinase and thymidine kinase (for gancyclovir); reductases, such as
nitroreductase (for CB1954 and analogues), azoreductase (for azobenzene
mustards) and DT-diaphorase (for CB1954); oxidases, such as glucose oxidase
(for glucose), xanthine oxidase (for xanthine) and lactoperoxidase; DL-
racemases, catalytic antibodies and cyclodextrins.

The prodrug is relatively non-toxic compared to the cytotoxic drug. Typically,
it has less than 10% of the toxicity, preferably less than 1% of the toxicity as
measured in a suitable in vitro cytotoxicity test.
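
The 10% and 1% thresholds can be illustrated with a short Python sketch. It assumes, purely for illustration, that toxicity is compared through IC50 values from the same in vitro assay and that toxicity is inversely proportional to IC50; the function names and category labels are hypothetical.

```python
def relative_toxicity(ic50_drug, ic50_prodrug):
    """Toxicity of the prodrug as a fraction of the toxicity of the drug.

    Assumption: toxicity is taken as inversely proportional to IC50, so a
    prodrug with a 100-fold higher IC50 has 1% of the toxicity of the drug.
    """
    if ic50_drug <= 0 or ic50_prodrug <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_drug / ic50_prodrug


def classify_prodrug(ic50_drug, ic50_prodrug):
    """Apply the 'less than 10%' and 'preferably less than 1%' thresholds."""
    ratio = relative_toxicity(ic50_drug, ic50_prodrug)
    if ratio < 0.01:
        return "preferred"
    if ratio < 0.10:
        return "typical"
    return "not relatively non-toxic"
```

Both IC50 values are expected in the same units; any other suitable in vitro measure of cytotoxicity could be substituted for IC50.
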

It is likely that the moiety which is able to convert a prodrug to a cytotoxic
drug will be active in isolation from the rest of the compound but it is
necessary only for it to be active when (a) it is in combination with the rest of
the compound and (b) the compound is attached to, adjacent to or internalised
in target cells.

When each moiety of the compound is a polypeptide, the two portions may be
linked together by any of the conventional ways of cross-linking polypeptides,
such as those generally described in O'Sullivan et al (1979) Anal. Biochem.
100, 100-108. For example, the ECSM1 or ECSM4 binding moiety may be
enriched with thiol groups and the further moiety reacted with a bifunctional
agent capable of reacting with those thiol groups, for example the N-
hydroxysuccinimide ester of iodoacetic acid (NHIA) or N-succinimidyl-3-(2-
pyridyldithio)propionate (SPDP). Amide and thioether bonds, for example
achieved with m-maleimidobenzoyl-N-hydroxysuccinimide ester, are generally
more stable in vivo than disulphide bonds.

Alternatively, the compound may be produced as a fusion compound by
recombinant DNA techniques whereby a length of DNA comprises respective
regions encoding the two moieties of the compound of the invention either
adjacent one another or separated by a region encoding a linker peptide which
does not destroy the desired properties of the compound. Conceivably, the two
portions of the compound may overlap wholly or partly.

The DNA is then expressed in a suitable host to produce a polypeptide
comprising the compound of the invention.
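
The arrangement of coding regions in such a fusion can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, the inputs are plain coding sequences already in frame, and any stop codon at the end of an upstream region is removed so that translation continues into the next region.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(cds):
    """Return an in-frame coding sequence without its terminal stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        return cds[:-3]
    return cds


def build_fusion(binding_cds, further_cds, linker_cds=""):
    """Join binding-moiety, optional linker and further-moiety coding regions."""
    further = further_cds.upper()
    if len(further) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    parts = [strip_stop(binding_cds)]
    if linker_cds:
        parts.append(strip_stop(linker_cds))
    parts.append(further)  # the downstream region keeps its own stop codon
    return "".join(parts)
```

For example, build_fusion("ATGGCCTAA", "ATGAAATGA", "GGTGGT") joins two toy coding regions through a two-codon linker. The sequences are placeholders chosen only to show the frame handling and do not encode any moiety of the invention.
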

The invention also provides a kit of parts (or a therapeutic system) comprising
(1) a compound of the invention wherein the further moiety is able to
convert a relatively non-toxic prodrug into a cytotoxic drug and (2) a relatively
non-toxic prodrug. The kit of parts may comprise any of the compounds of the
invention and appropriate prodrugs as herein disclosed.

The invention also provides a kit of parts (or a therapeutic system) comprising
(1) a compound of the invention wherein the further moiety is able to bind
selectively to a directly or indirectly cytotoxic moiety or to a readily detectable
moiety and (2) any one of a directly or indirectly cytotoxic or a readily
detectable moiety to which the further moiety of the compound is able to bind.

The cytotoxic moiety may be a radiosensitizer. Radiosensitizers include
fluoropyrimidines, thymidine analogues, hydroxyurea, gemcitabine,
fludarabine, nicotinamide, halogenated pyrimidines, 3-aminobenzamide, 3-
aminobenzodiamide, etanidazole, pimonidazole and misonidazole (see, for
example, McGinn et al (1996) J. Natl Cancer Inst. 88, 1193-1203; Shewach
& Lawrence (1996) Invest. New Drugs 14, 257-263; Horsman (1995) Acta
Oncol. 34, 571-587; Shenoy & Singh (1992) Clin. Invest. 10, 533-551;
Mitchell et al (1989) Int. J. Radiat. Biol. 56, 827-836; Iliakis & Kurtzman
(1989) Int. J. Radiat. Oncol. Biol. Phys. 16, 1235-1241; Brown (1989) Int. J.
Radiat. Oncol. Biol. Phys. 16, 987-993; Brown (1985) Cancer 55, 2222-2228).

Also, delivery of genes into cells can radiosensitise them, for example delivery
of the p53 gene or cyclin D (Lang et al (1998) J. Neurosurg. 89, 125-132;
Coco Martin et al (1999) Cancer Res. 59, 1134-1140).

The further moiety may be one which becomes cytotoxic, or releases a
cytotoxic moiety, upon irradiation. For example, the boron-10 isotope, when
appropriately irradiated, releases α particles which are cytotoxic (see for
example, US 4,348,376 to Goldenberg; Primus et al (1996) Bioconjug. Chem.
7, 532-535).

Similarly, the cytotoxic moiety may be one which is useful in photodynamic
therapy such as photofrin (see, for example, Dougherty et al (1998) J. Natl
Cancer Inst. 90, 889-905).

The further moiety may comprise a nucleic acid molecule which is directly or
indirectly cytotoxic. For example, the nucleic acid molecule may be an
antisense oligonucleotide which, upon localisation at the target site is able to
enter cells and lead to their death. The oligonucleotide, therefore, may be one
which prevents expression of an essential gene, or one which leads to a change
in gene expression which causes apoptosis.

Examples of suitable oligonucleotides include those directed at bcl-2 (Ziegler
et al (1997) J. Natl Cancer Inst. 89, 1027-1036), and DNA polymerase α and
topoisomerase IIα (Lee et al (1996) Anticancer Res. 16, 1805-1811).

Peptide nucleic acids may be useful in place of conventional nucleic acids (see 
Knudsen & Nielsen (1997) Anticancer Drugs 8, 113-118). 

In a further embodiment, the binding moiety may be comprised in a delivery
vehicle for delivering nucleic acid to the target. The delivery vehicle may be
any suitable delivery vehicle. It may, for example, be a liposome containing
nucleic acid, or it may be a virus or virus-like particle which is able to deliver
nucleic acid. In these cases, the moiety which selectively binds to ECSM1 or
ECSM4 is typically present on the surface of the delivery vehicle. For
example, the moiety which selectively binds to ECSM1 or ECSM4, such as a
suitable antibody or antibody fragment, may be present in the outer surface of a liposome
and the nucleic acid to be delivered may be present in the interior of the 
liposome. As another example, a viral vector, such as a retroviral or adenoviral 
vector, is engineered so that the moiety which selectively binds to ECSM1 or 
ECSM4 is attached to or located in the surface of the viral particle thus 
enabling the viral particle to be targeted to the desired site. Targeted delivery 
systems are also known such as the modified adenovirus system described in 
WO 94/10323 wherein, typically, the DNA is carried within the adenovirus, or 
adenovirus-like, particle. Michael et al (1995) Gene Therapy 2, 660-668 
describes modification of adenovirus to add a cell-selective moiety into a fibre 
protein. Targeted retroviruses are also available for use in the invention; for 
example, sequences conferring specific binding affinities may be engineered 
into preexisting viral env genes (see Miller & Vile (1995) Faseb J. 9, 190-199 
for a review of this and other targeted vectors for gene therapy). 

Immunoliposomes (antibody-directed liposomes) may be used in which the 
moiety which selectively binds to ECSM1 or ECSM4 is an antibody. For the 
preparation of immuno-liposomes MPB-PE (N-[4-(p- 
maleimidophenyl)butyryl]-phosphatidylethanolamine) is synthesised according 
to the method of Martin & Papahadjopoulos (1982) J. Biol. Chem. 257, 286-
288. MPB-PE is incorporated into the liposomal bilayers to allow a covalent
coupling of the anti-ECSM1 or -ECSM4 antibody, or fragment thereof, to the
liposomal surface. The liposome is conveniently loaded with the DNA or other
genetic construct for delivery to the target cells, for example, by forming the
said liposomes in a solution of the DNA or other genetic construct, followed by
sequential extrusion through polycarbonate membrane filters with 0.6 µm and
0.2 µm pore size under nitrogen pressures up to 0.8 MPa. After extrusion,
entrapped DNA construct is separated from free DNA construct by
ultracentrifugation at 80 000 x g for 45 min. Freshly prepared MPB-PE-
liposomes in deoxygenated buffer are mixed with freshly prepared antibody (or
fragment thereof) and the coupling reactions are carried out in a nitrogen
atmosphere at 4°C under constant end over end rotation overnight. The
immunoliposomes are separated from unconjugated antibodies by
ultracentrifugation at 80 000 x g for 45 min. Immunoliposomes may be
injected intraperitoneally or directly into the tumour.
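
The ultracentrifugation steps are specified as a relative centrifugal force (80 000 x g), whereas a centrifuge is usually set in revolutions per minute. The Python sketch below applies the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The rotor radius is not given in the text, so the 8 cm used in the usage note is an assumed value, and the function names are hypothetical.

```python
import math

# Standard conversion constant: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a target relative centrifugal force."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))
```

For an assumed rotor radius of 8 cm, rpm_for_rcf(80000, 8.0) gives a speed of roughly 30 000 rpm; the value for a real run depends on the radius of the rotor actually used.
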

The nucleic acid delivered to the target site may be any suitable DNA which
leads, directly or indirectly, to cytotoxicity. For example, the nucleic acid may
encode a ribozyme which is cytotoxic to the cell, or it may encode an enzyme
which is able to convert a substantially non-toxic prodrug into a cytotoxic drug
(this latter system is sometimes called GDEPT: Gene Directed Enzyme Prodrug
Therapy).

Ribozymes which may be encoded in the nucleic acid to be delivered to the
target are described in Cech and Herschlag "Site-specific cleavage of single
stranded DNA" US 5,180,818; Altman et al "Cleavage of targeted RNA by
RNAse P" US 5,168,053; Cantin et al "Ribozyme cleavage of HIV-1 RNA"
US 5,149,796; Cech et al "RNA ribozyme restriction endoribonucleases and
methods", US 5,116,742; Been et al "RNA ribozyme polymerases,
dephosphorylases, restriction endonucleases and methods", US 5,093,246; and
Been et al "RNA ribozyme polymerases, dephosphorylases, restriction
endoribonucleases and methods; cleaves single-stranded RNA at specific site
by transesterification", US 4,987,071, all incorporated herein by reference.
Suitable targets for ribozymes include transcription factors such as c-fos and
c-myc, and bcl-2. Durai et al (1997) Anticancer Res. 17, 3307-3312 describes a
hammerhead ribozyme against bcl-2.

EP 0 415 731 describes the GDEPT system. Similar considerations concerning
the choice of enzyme and prodrug apply to the GDEPT system as to the
ADEPT system described above.



The nucleic acid delivered to the target site may encode a directly cytotoxic 
polypeptide. 

Alternatively, the further portion may comprise a polypeptide or a
polynucleotide encoding a polypeptide which is not either directly or indirectly
cytotoxic but is of therapeutic benefit. Examples of such polypeptides include
anti-proliferative or anti-inflammatory cytokines, which could be of benefit in
atherosclerosis, and anti-proliferative or immunomodulatory cytokines, or
factors influencing blood clotting, which may be of benefit in treating cancer.

The further moiety may usefully be an inhibitor of angiogenesis such as the
peptides angiostatin or endostatin. The further moiety may also usefully be an
enzyme which converts a precursor polypeptide to angiostatin or endostatin.
Human matrix metallo-proteases such as macrophage elastase, gelatinase and
stromelysin convert plasminogen to angiostatin (Cornelius et al (1998) J.
Immunol. 161, 6845-6852). Plasminogen is a precursor of angiostatin.

In a further embodiment of the invention, the further moiety comprised in the
compound of the invention is a readily detectable moiety.

By a "readily detectable moiety" we include the meaning that the moiety is one
which, when located at the target site following administration of the
compound of the invention into a patient, may be detected, typically non-
invasively from outside the body and the site of the target located. Thus, the
compounds of this embodiment of the invention are useful in imaging and
diagnosis.

Typically, the readily detectable moiety is or comprises a radioactive atom
which is useful in imaging. Suitable radioactive atoms include technetium-99m
or iodine-123 for scintigraphic studies. Other readily detectable moieties
include, for example, spin labels for magnetic resonance imaging (MRI) such
as iodine-123 again, iodine-131, indium-111, fluorine-19, carbon-13,
nitrogen-15, oxygen-17, gadolinium, manganese or iron. Clearly, the compound
of the invention must have sufficient of the appropriate atomic isotopes in
order for the molecule to be readily detectable.

The radio- or other labels may be incorporated in the compound of the
invention in known ways. For example, if the binding moiety is a polypeptide
it may be biosynthesised or may be synthesised by chemical amino acid
synthesis using suitable amino acid precursors involving, for example,
fluorine-19 in place of hydrogen. Labels such as 99mTc, 123I, 186Rh, 188Rh and
111In can, for example, be attached via cysteine residues in the binding moiety.
Yttrium-90 can be attached via a lysine residue. The IODOGEN method
(Fraker et al (1978) Biochem. Biophys. Res. Comm. 80, 49-57) can be used to
incorporate iodine-123. Reference ("Monoclonal Antibodies in
Immunoscintigraphy", J-F Chatal, CRC Press, 1989) describes other methods
in detail.

In a further preferred embodiment of the invention the further moiety is able to
bind selectively to a directly or indirectly cytotoxic moiety or to a readily
detectable moiety. Thus, in this embodiment, the further moiety may be any
moiety which binds to a further compound or component which is cytotoxic or
readily detectable.

The further moiety may, therefore, be an antibody which selectively binds to
the further compound or component, or it may be some other binding moiety
such as streptavidin or biotin or the like. The following examples illustrate the
types of molecules that are included in the invention; other such molecules are
readily apparent from the teachings herein.

A bispecific antibody wherein one binding site comprises the moiety which
selectively binds to ECSM1 or ECSM4 and the second binding site comprises a
moiety which binds to, for example, an enzyme which is able to convert a
substantially non-toxic prodrug to a cytotoxic drug.

A compound, such as an antibody which selectively binds to ECSM1 or
ECSM4, to which is bound biotin. Avidin or streptavidin which has been
labelled with a readily detectable label may be used in conjunction with the
biotin labelled antibody in a two-phase imaging system wherein the biotin
labelled antibody is first localised to the target site in the patient, and then the
labelled avidin or streptavidin is administered to the patient. Bispecific
antibodies and biotin/streptavidin (avidin) systems are reviewed by
Rosebrough (1996) Q. J. Nucl. Med. 40, 234-251.

In a preferred embodiment of the invention, the moiety which selectively binds 
to ECSM1 or ECSM4 and the further moiety are polypeptides which are fused. 

The compounds of the first and second aspects of the invention are useful in 
treating, imaging or diagnosing disease, particularly diseases in which there 
may be undesirable neovasculature formation, as described in more detail 
below. 

In a preferred embodiment of the first and second aspects of the invention, the 
compounds are suitable for use in medicine. 

A third aspect of the invention provides a nucleic acid molecule encoding a 
compound of either the first or second aspects of the invention wherein the 
selective binding moiety and the further moiety are polypeptides which are 
fused. 

Methods of linking polynucleotides are described in more detail below. 

A fourth aspect of the invention provides a pharmaceutical composition 
comprising a compound according to the invention and a pharmaceutically 
acceptable carrier. The compound of the invention includes those described in
the first, second and third aspects. The invention also includes a pharmaceutical
composition comprising any of an antibody, polypeptide, peptide,
polynucleotide, expression vector or other agent which may be delivered to an
individual as described below and a pharmaceutically acceptable carrier.

By "pharmaceutically acceptable" is included that the formulation is sterile and
pyrogen free. Suitable pharmaceutical carriers are well known in the art of 
pharmacy. 

The carrier(s) must be "acceptable" in the sense of being compatible with the 
compound of the invention and not deleterious to the recipients thereof. 
Typically, the carriers will be water or saline which will be sterile and pyrogen 
free; however, other acceptable carriers may be used. 

Typically the pharmaceutical compositions or formulations of the invention are 
for parenteral administration, more particularly for intravenous administration. 

Formulations suitable for parenteral administration include aqueous and non- 
aqueous sterile injection solutions which may contain antioxidants, buffers, 
bacteriostats and solutes which render the formulation isotonic with the blood 
of the intended recipient; and aqueous and non-aqueous sterile suspensions 
which may include suspending agents and thickening agents. 

A fifth aspect of the invention provides a method of imaging vascular
endothelium in the body of an individual, the method comprising administering
to the individual an effective amount of a compound according to either of the
first or second aspects of the invention wherein the further moiety is a readily
detectable moiety.

Typically the vascular endothelium is associated with angiogenesis.

As discussed above in relation to the first and second aspects of the invention,
the moiety of the compound which selectively binds ECSM4 or ECSM1 may
be an antibody. Preferred antibodies are as outlined above.

In a preferred embodiment of this aspect of the invention, the method of 
imaging the vascular endothelium in an individual comprises the further step of 
detecting the location of the compound in the individual. 

Detecting the compound or antibody can be achieved using methods well
known in the art of clinical imaging and diagnostics. The specific method
required will depend on the type of detectable label attached to the compound
or antibody. For example, radioactive atoms may be detected using
autoradiography or in some cases by magnetic resonance imaging (MRI) as
described above.

Imaging the vascular endothelium in the body is useful because it can provide
information about the health of the body. It is particularly useful when the
vascular endothelium is diseased, or is proliferating due to a cancerous growth.
Imaging cancer in a patient is especially useful, because it can be used to
determine the size of a tumour and whether it is responding to treatment. Since
metastatic disease involves new blood vessel formation, the method is useful in
assessing whether metastasis has occurred.

Hence, in a preferred embodiment of the fifth aspect of the invention, the
vascular endothelium is neovasculature, such as that produced in cancer.

A sixth aspect of the invention provides a method of diagnosing or prognosing
in an individual a condition which involves the vascular endothelium, the
method comprising administering to the individual an effective amount of a
compound according to either of the first or second aspects of the invention
wherein the further moiety is a readily detectable moiety.

The condition may be one which involves aberrant or excessive growth of
vascular endothelium, such as cancer, atherosclerosis, restenosis, diabetic
retinopathy, arthritis, psoriasis, endometriosis, menorrhagia, haemangiomas
and venous malformations.

As discussed in relation to the first and second aspects of the invention, the
compound may comprise an antibody. The antibody may be any antibody
which selectively binds the polypeptide ECSM1 or ECSM4 as required.
Preferred antibodies which bind the polypeptide ECSM4 are as outlined above.

The method may be one which is an aid to diagnosis.

In a preferred embodiment of this aspect of the invention, the method of
diagnosing, or aiding diagnosis of, a condition involving the vascular
endothelium in an individual comprises the further step of detecting the
location of the compound in the individual. Preferably the endothelium is in
neovasculature; ie, angiogenic vasculature.

The function of ECSM4 or ECSM1 may not be to promote proliferation of
vascular endothelial cells. Therefore the level of expression of these
polypeptides within an endothelial cell may not be informative about the health
of the vascular endothelium. However, the location of expression of the
polypeptides may be informative, as they represent the growth of blood
vessels. Abnormal cell proliferation such as cancer may be diagnosed by the
detection of new vasculature.

WO 02/36771    PCT/GB01/04906

A seventh aspect of the invention provides a method of treating an individual
in need of treatment, the method comprising administering to the individual an
effective amount of a compound according to the first or second aspects of the
invention wherein the further moiety is a cytotoxic or therapeutic moiety.

In one embodiment of this aspect, the patient in need of treatment has a
proliferative disease or a condition involving the vascular endothelium.

A number of diseases and conditions involve undesirable neovasculature
formation. Neovasculature formation is associated with cancer, psoriasis,
atherosclerosis, menorrhagia, arthritis (both inflammatory and rheumatoid),
macular degeneration, Paget's disease, retinopathy and its vascular
complications (including proliferative and of prematurity, and diabetic), benign
vascular proliferations and fibroses.

By cancer is included Kaposi's sarcoma, leukaemia, lymphoma, myeloma,
solid carcinomas (both primary and secondary (metastasis)), vascular tumours
including haemangioma (both capillary and juvenile (infantile)),
haemangiomatosis and haemangioblastoma.

Thus, the invention comprises a method of treating a patient who has a disease
in which angiogenesis contributes to pathology, the method comprising the step
of administering to the patient an effective amount of a compound of the first
or second aspect of the invention wherein the further moiety of the compound
is one which either directly or indirectly is of therapeutic benefit to the patient.

Typically, the disease is associated with undesirable neovasculature formation
and the treatment reduces this to a useful extent.

The tumours that may be treated by the methods of the invention include any
tumours which are associated with new blood vessel production.

The term "tumour" is to be understood as referring to all forms of neoplastic
cell growth, including tumours of the lung, liver, blood cells, skin, pancreas,
stomach, colon, prostate, uterus, breast, lymph glands and bladder. Solid
tumours are especially suitable. However, blood cancers, including leukaemias
and lymphomas are now also believed to involve new blood vessel formation
and may be treated by the methods of the invention.

Typically in the above-mentioned methods of treatment, the further moiety is
one which destroys or slows or reverses the growth of the neovasculature.

It will readily be appreciated that, depending on the particular compound used
in imaging, diagnosis or treatment, the timing of administration may vary and
the number of other components used in therapeutic systems disclosed herein
may vary.

For example, in the case where the compound of the invention comprises a
readily detectable moiety or a directly cytotoxic moiety, it may be that only the
compound, in a suitable formulation, is administered to the patient. Of course,
other agents such as immunosuppressive agents and the like may be
administered.

In respect of compounds which are detectably labelled, imaging takes place
once the compound has localised at the target site.

However, if the compound is one which requires a further component in order
to be useful for treatment, imaging or diagnosis, the compound of the invention
may be administered and allowed to localise at the target site, and then the
further component administered at a suitable time thereafter.

For example, in respect of the ADEPT and ADEPT-like systems above, the
binding moiety-enzyme moiety compound is administered and localises to the
target site. Once this is done, the prodrug is administered.

Similarly, for example, in respect of the compounds wherein the further moiety
comprised in the compound is one which binds a further component, the
compound may be administered first and allowed to localise at the target site,
and subsequently the further component is administered.

Thus, in one embodiment a biotin-labelled anti-ECSM1 or -ECSM4 antibody is
administered to the patient and, after a suitable period of time, detectably
labelled streptavidin is administered. Once the streptavidin has localised to the
sites where the antibody has localised (ie the target sites) imaging takes place.

Where the moiety of the compound which selectively binds is an antibody, the
antibody may be any antibody which selectively binds the polypeptide
ECSM1 or ECSM4 as required. Preferred antibodies are as outlined in the
first and second aspects of the invention.

It is believed that the compounds of the invention wherein the further moiety is
a readily detectable moiety may be useful in determining the angiogenic status
of tumours or other disease states in which angiogenesis contributes to
pathology. This may be an important factor influencing the nature and
outcome of future therapy.

An eighth aspect of the invention provides a method of introducing genetic
material selectively into vascular endothelial cells, the method comprising
contacting the cells with a compound according to either of the first or second
aspects of the invention as described above wherein the further moiety is a
nucleic acid.

The vascular endothelial cells may be any vascular endothelial cells such as
those in tissue culture or in a living organism. It is preferred if the cells are in a
living organism. It is further preferred if the organism is a human. It is still
more preferred if the vascular endothelial cells are those in neovasculature, ie
they are angiogenic endothelial cells.

Preferably, the binding moiety is an antibody. The antibody may be any
antibody which selectively binds the polypeptide ECSM1 or ECSM4 as
required. Preferably, the antibody is one as defined above in relation to the
first or second aspects of the invention. Typically, the binding moiety is
comprised in a delivery vehicle and preferably, the delivery vehicle is a 
liposome, as described in further detail above. In this embodiment, the further 
moiety is nucleic acid and is comprised within the liposome, also as described 
above. Typically, the method is used in gene therapy, and the genetic material 
is therapeutically useful. Therapeutically useful genetic material includes that 
which encodes a therapeutic protein. 

A ninth aspect of the invention provides a use of a compound according to 
either of the first or second aspects of the invention wherein the further moiety 
is a readily detectable label in the manufacture of a diagnostic or prognostic 
agent for a condition which involves the vascular endothelium. 

As discussed above, the compound may comprise an antibody as the moiety 
which selectively binds. The antibody may be any antibody which selectively 
binds the polypeptide ECSM1 or ECSM4 as required. 

A tenth aspect of the invention provides a use of a compound according to 
either of the first or second aspects of the invention wherein the further moiety 
is a cytotoxic or therapeutic moiety in the manufacture of a medicament for 
treating a condition involving the vascular endothelium. 

Conditions which involve the vascular endothelium are described above. 

As described above, the compound may comprise an antibody as the moiety 
which selectively binds. The antibody may be any suitable antibody which 
selectively binds the polypeptide ECSM1 or ECSM4 as required. 
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An eleventh aspect of the invention provides a polypeptide comprising or 
consisting of a fragment or variant or fusion of the ECSM4 polypeptide or a 
fusion of said fragment or variant provided that it is not a polypeptide 
5 consisting of the amino acid sequence given between residues 49 and 466 of 
Figure 4. 

The ECSM4 polypeptide includes a polypeptide comprising or consisting of 
the amino acid sequence given in Figure 4 or Figure 5 or Figure 7 or Figure 12 
10 or Figure 13 or the polypeptide encoded by the nucleotide sequence of either 
Figure 4 between positions 1 and 1395 or Figure 5 between positions 2 and 948 
. or Figure 7 or Figure 12 or Figure 13 is that of the ECSM4 polypeptide. 
Preferably, the ECSM4 polypeptide of the invention comprises but does not 
consist of the amino acid sequence given in Figure 4. 

15 

Preferably, the ECSM4 polypeptide of the invention does not consist of any of 
the amino acid sequences represented by SEQ ID No 18085 of EP 1 074 617, 
SEQ ID No 211 of either WO 00/53756 or W099/46281, SEQ ID Nos 24-27, 
29, 30, 33, 34, 38 or 39 of WO 01/23523, or SEQ ID No 86 of WO 99/11293, 
20 or any of the amino acid sequences encoded by SEQ ID No 18084 or 5096 of 
EP 1 074 617, SEQ ID No 210 of WO 00/53756 or WO 99/46281, or SEQ ID 
Nos 22, 23, 96 or 98 of WO 01/23523 or SEQ ID No 31 of WO 99/11293. 

A twelfth aspect of the invention provides a polypeptide comprising or
consisting of the ECSM1 polypeptide or a fragment or variant or fusion thereof
or a fusion of said fragment or variant.

The ECSM1 polypeptide includes a polypeptide comprising or consisting of
the amino acid sequence given in Figure 2. Preferably, the ECSM1
polypeptide or fragment is not a polypeptide whose sequence is given in SEQ
ID No 120 of WO 99/06423 or which is encoded by SEQ ID No 32 of WO
99/06423 or encoded by the nucleic acid of ATCC deposit No 209145 made on
July 17 1997 for the purposes of WO 99/06423.

The invention includes peptides which are derived from the ECSM4 or ECSM1
polypeptides. These peptides may be considered "fragments" of the ECSM4 or
ECSM1 polypeptides but may be produced by de novo synthesis or by
fragmentation of the polypeptide.

"Fragments" of the ECSM4 or ECSM1 polypeptide include polypeptides
which comprise at least five consecutive amino acids of the ECSM4 or ECSM1
polypeptide. Preferably, a fragment of the polypeptide comprises an amino
acid sequence which is useful, for example, a fragment which retains activity
of the polypeptide, or a fragment for use in a binding assay or which is useful
as a peptide for producing an antibody which is specific for the ECSM4 or
ECSM1 polypeptide. An activity of the ECSM4 polypeptide may be in
endothelial cell repulsive guidance. Repulsive guidance may be tested in vivo
by constructing appropriate transgenic or knock-out animal models, for
example mice or zebrafish. It may also be tested in vitro in cell migration
assays such as Boyden chamber or video microscopy. Typically, the fragments
have at least 8 consecutive amino acids, preferably at least 10, more
preferably at least 12 or 15 or 20 or 30 or 40 or 50 consecutive amino acids of
the ECSM4 or ECSM1 polypeptide. Preferably, fragments of the ECSM4
polypeptide comprise but do not consist of the amino acid sequence given in
Figure 4 or Figure 5 or Figure 7 or Figure 12 or Figure 13. Preferably,
fragments of the ECSM4 polypeptide comprise but do not consist of any of the
amino acid sequences represented by SEQ ID No 18085 of EP 1 074 617,
SEQ ID No 211 of either WO 00/53756 or WO 99/46281, SEQ ID Nos 24-27,
29, 30, 33, 34, 38 or 39 of WO 01/23523, or SEQ ID No 86 of WO 99/11293,
or any of the amino acid sequences encoded by SEQ ID No 18084 or 5096 of
EP 1 074 617, SEQ ID No 210 of WO 00/53756 or WO 99/46281, or SEQ ID
Nos 22, 23, 96 or 98 of WO 01/23523 or SEQ ID No 31 of WO 99/11293.

Typically, the fragments of ECSM4 polypeptide are ones which have portions
of the amino acid sequence shown in Figure 4 or Figure 12.

Typically, the fragments of ECSM1 polypeptide are ones which have portions 
of the amino acid sequence shown in Figure 2. 

In a preferred embodiment of the thirteenth aspect of the invention, a fragment
of the ECSM4 polypeptide is a fragment which has the sequence
LSQSPGAVPQALVAWRA, DSVLTPEEVALCLEL, TYGYISVPTA,
KGGVLLCPPRPCLTPT, WLADTW, WLADTWRSTSGSRD,
SPPTTYGYIS, GSLANGWGSASEDNAASARASLVSSSDGSFLAD or
FARALAVAVD or has a sequence of at least 5 or 8 or 10 residues of any of
these sequences. These peptides correspond to amino acids 165-181, 274-288,
311-320, 336-351, 8-13, 8-21, 307-316, 355-387 and 390-399 respectively of
the human ECSM4 polypeptide shown in Figure 4. Peptides WLADTW,
WLADTWRSTSGSRD, SPPTTYGYIS,
GSLANGWGSASEDNAASARASLVSSSDGSFLAD and FARALAVAVD
represent conserved regions between the mouse and human homologues of the
ECSM4 polypeptide, and between the ECSM4 polypeptide and the mouse
dutt1 protein. The peptides LSQSPGAVPQALVAWRA,
DSVLTPEEVALCLEL, TYGYISVPTA and KGGVLLCPPRPCLTPT may be
useful in raising antibodies.
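
By way of illustration only, the peptide sequences and residue ranges listed
above can be checked for internal consistency, and the contiguous
sub-sequences "of at least 5 or 8 or 10 residues" enumerated. The following is
a minimal sketch; the function names are hypothetical and only the sequences
and ranges are taken from the text.

```python
# Peptides and their 1-based, inclusive residue ranges (Figure 4 numbering),
# copied from the paragraph above.
PEPTIDES = {
    "LSQSPGAVPQALVAWRA": (165, 181),
    "DSVLTPEEVALCLEL": (274, 288),
    "TYGYISVPTA": (311, 320),
    "KGGVLLCPPRPCLTPT": (336, 351),
    "WLADTW": (8, 13),
    "WLADTWRSTSGSRD": (8, 21),
    "SPPTTYGYIS": (307, 316),
    "GSLANGWGSASEDNAASARASLVSSSDGSFLAD": (355, 387),
    "FARALAVAVD": (390, 399),
}


def range_matches_length(peptide, start, end):
    """True if the inclusive range start..end spans exactly len(peptide) residues."""
    return end - start + 1 == len(peptide)


def subfragments(peptide, min_len):
    """All contiguous sub-sequences of the peptide with at least min_len residues."""
    n = len(peptide)
    return [peptide[i:j] for i in range(n) for j in range(i + min_len, n + 1)]


if __name__ == "__main__":
    for pep, (start, end) in PEPTIDES.items():
        print(pep, start, end, range_matches_length(pep, start, end))
    print(subfragments("WLADTW", 5))
```

The sketch checks only lengths against ranges; it does not confirm that a
peptide actually occurs at the stated position, which would require the
sequence of Figure 4.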

Preferred peptides are peptides of at least 5 or 8 or 10 or 12 or 15 or 20
consecutive amino acid residues from these conserved sequences. Peptides of
ECSM4 which affect cell migration and/or growth and/or vascular
development are particularly preferred. They can be identified in suitable
screening systems.

In a further preferred embodiment of this aspect of the invention, a fragment of
the ECSM4 polypeptide is a fragment which has the sequence
GGDSLLGGRGSL, LLQPPARGHAHDGQALSTDL, EPQDYTEPVE,
TAPGGQGAPWAEE or ERATQEPSEHGP or has a sequence of at least 5 or
8 or 10 residues of any of these sequences. These peptides correspond to
regions of the human ECSM4 polypeptide (located at residues 4-16, 91-109,
227-236, 288-300 and 444-455 respectively in the sequence given in Figure 12)
which are not, or are poorly, conserved in the mouse homologue (see Figure
14). As described below, such peptides may be particularly useful in raising
antibodies to the human ECSM4 polypeptide.

According to the transmembrane domain predicting software program called
PRED-TMR (available at the internet site http://www.biophys.biol.uoa.gr) and
an amino acid sequence alignment with the human protein Robo1 (whose
transmembrane region is known), residues 1-467 as shown in Figure 12 are
likely to be extracellular, and in addition to being extracellularly exposed, may
include the binding site of the natural ligand. Hence fragments of ECSM4
which include or consist of a sequence within the extracellular domain of
residues 1-467 of Figure 12 may represent useful fragments for raising
antibodies selective for cells expressing ECSM4 on their surface and which
may also be useful in modulating the activity of the polypeptide ECSM4.

Hence, preferred fragments of the ECSM4 polypeptide are those fragments of
the polypeptide sequence of Figure 12 which comprise at least 1, 3 or 5 amino
acid residues which are not conserved when compared to the mouse ECSM4
(as shown in Figure 13). More preferably at least 7, 9, 11 or 13 amino acid
residues in the fragment are not conserved between human ECSM4 and mouse
ECSM4, and still more preferably at least 15, 17, 19 or 21 residues of the
fragment are not conserved between human ECSM4 and mouse ECSM4. The
sequence of such fragments may be determined from the alignment of the
human and mouse amino acid sequences shown in Figure 14.
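
As a sketch of how the number of non-conserved residues in a candidate
fragment might be counted from a pairwise alignment, consider the following.
The function name and the two short aligned sequences are invented for
illustration and are not taken from Figure 14; the treatment of gap columns is
likewise an assumption.

```python
def non_conserved_count(human_aln, mouse_aln, start, end):
    """Count human residues in alignment columns [start, end) that differ from mouse.

    Both arguments are rows of one pairwise alignment, with "-" marking gaps.
    Columns where the human row has a gap are skipped (an assumption), since
    they contribute no residue to the human fragment.
    """
    if len(human_aln) != len(mouse_aln):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for h, m in zip(human_aln[start:end], mouse_aln[start:end])
        if h != "-" and h != m
    )


# Toy aligned rows, invented for illustration only.
human = "MKTAYLLAG-RSPE"
mouse = "MKSAYLVAGQRTPD"

if __name__ == "__main__":
    print(non_conserved_count(human, mouse, 0, len(human)))
```

A fragment would then meet, for example, the "at least 3" preference when the
returned count for its columns is 3 or more.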

It will be appreciated that fragments of the ECSM4 or ECSM1 polypeptide of
the invention are particularly useful when fused to other polypeptides, such as
glutathione-S-transferase (GST), green fluorescent protein (GFP), vesicular
stomatitis virus glycoprotein (VSVG) or keyhole limpet haemocyanin (KLH).
Fusions of the polypeptide, or fusions of fragments or variants of the
polypeptide of the invention are included in the scope of the invention.

Other useful fragments of ECSM4 are those which are able to bind a ligand
selective for ECSM4. Suitable methods for identification of ligands such as
peptides or other molecules which bind ECSM4 are discussed in more detail
above. Such peptides or other ECSM4-binding molecules can be used to
identify the amino acid sequences present in ECSM4 which are responsible for
ligand binding. Identification of those fragments of ECSM4 which, when
isolated from the rest of the molecule, are still able to bind a ligand of ECSM4
can be achieved by means of a screen. Typically, such a screen will comprise
contacting a ligand of ECSM4 with a test fragment of the ECSM4 polypeptide
and determining if the test fragment binds the ligand. Fragments of ECSM4
are within the scope of the invention, and may be particularly useful in
medicine. A fragment of ECSM4 which binds the natural ECSM4 ligand may
neutralise the effect of the ligand and thereby affect endothelial cell migration,
growth and/or vascular development. Hence, administration of fragments of
ECSM4 may be useful in the treatment of diseases or conditions where
endothelial cell migration, growth and/or vascular development need to be
modulated. Examples of such diseases include cancer and atherosclerosis.

A "fusion" of the ECSM4 or ECSM1 polypeptide or a fragment or variant
thereof provides a molecule comprising a polypeptide of the invention and a
further portion. It is preferred that the said further portion confers a desirable
feature on the said molecule; for example, the portion may be useful in
detecting or isolating the molecule, or promoting cellular uptake of the
molecule. The portion may be, for example, a biotin moiety, a radioactive
moiety, a fluorescent moiety, for example a small fluorophore or a green
fluorescent protein (GFP) fluorophore, as well known to those skilled in the
art. The moiety may be an immunogenic tag, for example a Myc tag, as known
to those skilled in the art or may be a lipophilic molecule or polypeptide
domain that is capable of promoting cellular uptake of the molecule or the
interacting polypeptide, as known to those skilled in the art.

A "variant" of the ECSM4 or ECSM1 polypeptide includes natural variants,
including allelic variants and naturally-occurring mutant forms and variants
with insertions, deletions and substitutions, either conservative or
non-conservative, where such changes do not substantially alter the activity of
the said polypeptide. In the case of the ECSM4 polypeptide, as an endothelial
specific homologue of the human roundabout 1, it may well be involved in
endothelial cell repulsive guidance. In addition, polypeptides which are
elongated as a result of an insertion or which are truncated due to deletion of a
region are included in the scope of the invention. For example, deletion of
cytoplasmically-located regions may be useful in creation of "dominant
negative" or "dominant positive" forms of the polypeptide. Similarly, deletion
of a transmembrane region of the polypeptide may produce such forms.

By "conservative substitution" is intended combinations such as Gly, Ala; Val,
Ile, Leu; Asp, Glu; Asn, Gln; Ser, Thr; Lys, Arg; and Phe, Tyr.
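
The groupings above lend themselves to a simple lookup. The following is a
minimal sketch, using one-letter amino acid codes, of how a substitution might
be classified as conservative in the sense defined above; the names are
hypothetical.

```python
# Conservative groups from the definition above, in one-letter code:
# Gly/Ala; Val/Ile/Leu; Asp/Glu; Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    {"G", "A"},
    {"V", "I", "L"},
    {"D", "E"},
    {"N", "Q"},
    {"S", "T"},
    {"K", "R"},
    {"F", "Y"},
]


def is_conservative(original, replacement):
    """True if replacing `original` by `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residues are not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    for pair in [("D", "E"), ("V", "L"), ("Y", "D")]:
        print(pair, is_conservative(*pair))
```

Any substitution for which the function returns False would, on this
definition, fall under "non-conservative substitution".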

By "non-conservative substitution" we include other substitutions, such as
those where the substituted residue mimics a particular modification of the
replaced residue, for example a phosphorylated tyrosine or serine may be
replaced by aspartate or glutamate due to the similarity of the aspartate or
glutamate side chain to a phosphorylated residue (ie they carry a negative
charge at neutral pH).

Further non-conservative substitutions which are included in the term
"variants" are point mutations which alter one, sometimes two, and usually no
more than three amino acids. Such mutations are well known in the art of
biochemistry and are usually designed to insert or remove a defined
characteristic of the polypeptide. Another type of non-conservative mutation is
the alteration or addition of a residue to a cysteine or lysine residue which can
then be used with maleimide or succinimide cross-linking reagents to
covalently conjugate the polypeptide to another moiety. Non-glycosylated
proteins may be mutated to convert an asparagine to the recognition motif
N-X-S/T for N-linked glycosylation. Such a modification may be useful to
create a tag for purification of the polypeptide using Concanavalin A-linked
beads.
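
As an illustration of the N-X-S/T motif, the sketch below scans a sequence for
existing occurrences. The exclusion of proline at the X position is a commonly
applied refinement and is an assumption here, not something stated above; the
function name and test sequence are hypothetical.

```python
import re

# N, then any residue except proline (assumption), then S or T.
# The lookahead makes the match zero-width so overlapping motifs are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented sequence: NGT and NNS qualify, NPS is excluded by the proline rule.
    print(find_sequons("MNGTANPSQNNSA"))
```

An asparagine that is not reported by such a scan could be a candidate for the
mutation described above, by altering the residue two positions downstream to
serine or threonine.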

Such variants may be made using the methods of protein engineering and
site-directed mutagenesis well known in the art.

Variants of the ECSM4 polypeptide include polypeptides comprising a
sequence with at least 65% identity to the amino acid sequence given in Figure
4 or Figure 7 or Figure 12 or Figure 13, preferably at least 70% or 80% or 85%
or 90% identity to said sequence, and more preferably at least 95% or 98%
identity to said amino acid sequence.

Variants of the ECSM1 polypeptide include polypeptides comprising a
sequence with at least 65% identity to the amino acid sequence given in Figure
2, preferably at least 70% or 80% or 85% or 90% identity to said sequence, and
more preferably at least 95% or 98% identity to said amino acid sequence.

Percent identity can be determined by, for example, the LALIGN program
(Huang and Miller, Adv. Appl. Math. (1991) 12:337-357) at the Expasy facility
site (http://www.ch.embnet.org/software/LALIGN_form.html) using as
parameters the global alignment option, scoring matrix BLOSUM62, opening
gap penalty -14, extending gap penalty -4.
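
The following sketch shows only the final arithmetic of a percent identity
calculation, once an alignment has been produced by a program such as the one
named above. It does not perform the alignment. Taking the number of alignment
columns as the denominator is an assumption (programs differ in this
convention), and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both rows carry the same residue.

    The inputs are the two rows of an existing pairwise alignment, with "-"
    marking gaps. Gap columns count towards the denominator but never as
    matches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    identity = percent_identity("ACDEFGHIKL", "ACDEYGHIKV")
    print(identity, identity >= 65.0)
```

A candidate variant would then be compared against the 65%, 70%, 80%, 85%,
90%, 95% or 98% thresholds recited above.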

A thirteenth aspect of the invention provides a polynucleotide encoding the
ECSM4 polypeptide of the invention, or the complement thereof or a
polynucleotide which selectively hybridises to either of these, which
polynucleotide is not any one of the clones corresponding to GenBank
Accession No AK000805 or the ESTs whose GenBank Accession Nos are
given in Table 11 or Table 12.

GenBank Accession No AK000805 corresponds to a cDNA sequence cloned in
the vector pME18SFL3. ESTs listed in Table 11 represent nucleotide
sequences which can be assembled into the contig sequence shown in Figure 5.
ESTs listed in Table 12 represent nucleotide sequences which can be
assembled into the mouse nucleotide cluster sequence (Mm.27782) given in
Figure 7.

Preferably, the polynucleotide of this aspect of the invention does not consist
of any one of the nucleotide sequences represented by SEQ ID No 18084 or
5096 of EP 1 074 617, SEQ ID No 210 of WO 00/53756 or WO 99/46281, or
SEQ ID Nos 22, 23, 96 or 98 of WO 01/23523 or SEQ ID No 31 of WO
99/11293, or their complement.

Also preferably, the polynucleotide of this aspect of the invention is not a
polynucleotide which encodes a polypeptide consisting of the amino acid
sequence represented by any one of SEQ ID No 18085 of EP 1 074 617, SEQ
ID No 211 of either WO 00/53756 or WO 99/46281, SEQ ID Nos 24-27, 29,
30, 33, 34, 38 or 39 of WO 01/23523, or SEQ ID No 86 of WO 99/11293.

Polynucleotides of the thirteenth aspect of the invention are described in more
detail below.

A fourteenth aspect of the invention provides a polynucleotide encoding the
ECSM1 polypeptide or the complement thereof or a polynucleotide which
selectively hybridises to either of these, according to the twelfth aspect of the
invention provided that the polynucleotide is not one present in ATCC deposit
No 209145 or the clone corresponding to GenBank Accession No AC011526
or the ESTs whose GenBank Accession Nos are given in Table 10.

By "encoding a polypeptide according to the twelfth aspect of the invention"
we mean that the polynucleotide is one which encodes an ECSM1 polypeptide
of the invention and is not one which encodes a polypeptide whose sequence is
given in SEQ ID No 120 of WO 99/06423 or which is encoded by SEQ ID No
32 or by the nucleic acid included in the microbiological deposit corresponding
to American Type Culture Collection (ATCC) No. 209145 made on 17 July
1997.

ATCC deposit No 209145 comprises a pSport1 vector which includes a 765
base nucleotide sequence.

The polynucleotide sequence given in SEQ ID No 32 of WO 99/06423 is
similar to the nucleotide sequence shown in Figure 2. The sequence of SEQ ID
No 32 given in WO 99/06423 may be capable of encoding part of the ECSM1
polypeptide of the invention. Due to degeneracy of the genetic code however,
a polynucleotide sequence may encode the ECSM1 polypeptide of the
invention without having a nucleotide sequence as given in WO 99/06423. In a
similar manner, a polynucleotide sequence may encode the (full length)
ECSM4 polypeptide of the invention without having the same sequence as that
given in Figure 4 or Figure 5 or Figure 12. Such polynucleotides are within the
scope of this invention.
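
The degeneracy point can be illustrated with a short sketch in which two
different nucleotide sequences encode the same peptide. The standard genetic
code is assumed; the two DNA sequences are invented for illustration (they are
not taken from any Figure) and are chosen to encode the hexapeptide WLADTW
mentioned above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA coding strand from its first base until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Two invented sequences that differ at four codons yet encode one peptide.
SEQ_A = "TGGCTGGCCGACACCTGG"
SEQ_B = "TGGTTAGCTGATACATGG"

if __name__ == "__main__":
    print(translate(SEQ_A), translate(SEQ_B), SEQ_A != SEQ_B)
```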

Hence, it will be appreciated that a polynucleotide of the thirteenth aspect of
the invention is preferably not one whose nucleotide sequence is given in
Figure 4, and that a polynucleotide of the fourteenth aspect of the invention is
preferably not a polynucleotide which is disclosed in WO 99/06423, such as
SEQ ID No 32 disclosed therein or its complement or variants or the
corresponding cDNA sequence deposited under Accession No 209145 at the
ATCC or a polynucleotide fragment capable of encoding a polypeptide whose
amino acid sequence comprises the sequence given in SEQ ID No 120 of WO
99/06423.

A polynucleotide of the thirteenth or fourteenth aspects of the invention may
encode a variant of the ECSM4 or ECSM1 polypeptide as described above. In
addition, the insertions and/or deletions within the ECSM4 or ECSM1
polypeptide may lead to frameshift mutations which may encode truncated (or
elongated) polypeptide products, and insertions, deletions or other mutations
may lead to the introduction of stop codons which encode truncated
polypeptide products.
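
A minimal sketch of how a single-base insertion can shift the reading frame
and introduce a premature stop codon is given below. The standard genetic code
is assumed, and the short coding sequence is invented for illustration; it is
not an ECSM4 or ECSM1 sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def insert_base(dna, position, base):
    """Return dna with `base` inserted before 0-based index `position`."""
    return dna[:position] + base + dna[position:]


# Invented coding sequence: ATG GCT GAT ACA TGG CGT.
ORIGINAL = "ATGGCTGATACATGGCGT"
# Inserting one A after the first codon gives ATG AGC TGA ..., i.e. a new
# reading frame that meets a stop codon at the third codon.
MUTANT = insert_base(ORIGINAL, 3, "A")

if __name__ == "__main__":
    print(translate(ORIGINAL), translate(MUTANT))
```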

The polynucleotide of the invention may be DNA or RNA. It is preferred if it
is DNA.

The polynucleotide may or may not contain introns. It is preferred if it does
not contain introns.

The polynucleotide may be single stranded or double stranded or a mixture of
both.

The polynucleotide of the invention has at least 10 nucleotides, and preferably
at least 15 nucleotides and more preferably at least 30 nucleotides. In a further
preference, the polynucleotide is more than 50 nucleotides, more preferably at
least 100 nucleotides, and still more preferably the polynucleotide is at least
500 nucleotides. The polynucleotide may be more than 1kb, and may comprise
more than 5kb.

The invention also includes a polynucleotide which is able to selectively
hybridise to a polynucleotide which encodes the ECSM4 or ECSM1
polypeptide or a fragment or variant or fusion thereof, or a fusion of said
variant or fragment. Preferably, said polynucleotide is at least 10 nucleotides,
more preferably at least 15 nucleotides and still more preferably at least 30
nucleotides in length. The said polynucleotide may be longer than 100
nucleotides and may be longer than 200 nucleotides, but preferably the said
polynucleotide is not longer than 250 nucleotides. Such polynucleotides are
useful in procedures as a detection tool to demonstrate the presence of the
polynucleotide in a sample. Such a sample may be a sample of DNA, such as a
bacterial colony, fixed on a membrane or filter.

Preferably, the polynucleotide which is capable of selectively hybridising as
said is not any one of the nucleotide sequences represented by SEQ ID No
18084 or 5096 of EP 1 074 617, SEQ ID No 210 of WO 00/53756 or WO
99/46281, or SEQ ID Nos 22, 23, 96 or 98 of WO 01/23523 or SEQ ID No 31
of WO 99/11293.

By "selectively hybridise" we mean that the polynucleotide hybridises under
conditions of high stringency. DNA-DNA, DNA-RNA and RNA-RNA
hybridisation may be performed in aqueous solution containing between 0.1X
SSC and 6X SSC and at temperatures of between 55°C and 70°C. It is well
known in the art that the higher the temperature or the lower the SSC
concentration the more stringent the hybridisation conditions. By "high
stringency" we mean 2X SSC and 65°C. 1X SSC is 0.15M NaCl/0.015M
sodium citrate. Polynucleotides which hybridise at high stringency are
included within the scope of the claimed invention.
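
The salt concentrations implied by a given SSC strength follow by simple
scaling from 1X SSC (0.15M NaCl/0.015M sodium citrate). The sketch below
computes them and applies the rule that a higher temperature or a lower SSC
concentration is more stringent. The function names are hypothetical, and the
comparison is deliberately conservative: it reports a condition as at least as
stringent as the 2X SSC, 65°C reference only when neither parameter is less
stringent.

```python
NACL_M_AT_1X = 0.15      # mol/l NaCl in 1X SSC
CITRATE_M_AT_1X = 0.015  # mol/l sodium citrate in 1X SSC


def ssc_composition(strength_x):
    """Return (NaCl molarity, sodium citrate molarity) for an SSC strength."""
    return (strength_x * NACL_M_AT_1X, strength_x * CITRATE_M_AT_1X)


def at_least_as_stringent(temp_c, ssc_x, ref_temp_c=65.0, ref_ssc_x=2.0):
    """True if the temperature is no lower AND the SSC strength is no higher
    than the reference (default: the 2X SSC, 65 degrees C definition above)."""
    return temp_c >= ref_temp_c and ssc_x <= ref_ssc_x


if __name__ == "__main__":
    print(ssc_composition(2.0))
    print(at_least_as_stringent(68.0, 0.1))
    print(at_least_as_stringent(55.0, 6.0))
```

Conditions in which one parameter is more stringent and the other less so are
not ranked by this sketch, since the text gives no rule for trading one
against the other.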

In another embodiment, the polynucleotide can be used as a primer in the
polymerase chain reaction (PCR), and in this capacity a polynucleotide of
between 15 and 30 nucleotides is preferred. A polynucleotide of between 20
and 100 nucleotides is preferred when the fragment is to be used as a
mutagenic PCR primer. It is particularly preferred if the PCR primer (when
not being used to mutate a nucleic acid) contains about 15 to 30 contiguous
nucleotides (ie perfect matches) from the nucleotide sequence given in Figure 4
or Figure 7 or Figure 12 or Figure 13 or from the nucleotide sequence given in
Figure 2. Clearly, if the PCR primers are used for mutagenesis, differences
compared to the sequence will be present.

Primers which are suitable for use in a polymerase chain reaction (PCR; Saiki
et al (1988) Science 239, 487-491) are preferred. Suitable PCR primers may
have the following properties:

It is well known that the sequence at the 5' end of the oligonucleotide need not
match the target sequence to be amplified.

It is usual that the PCR primers do not contain any complementary structures
with each other longer than 2 bases, especially at their 3' ends, as this feature
may promote the formation of an artifactual product called "primer dimer".
When the 3' ends of the two primers hybridize, they form a "primed template"
complex, and primer extension results in a short duplex product called "primer
dimer".

Internal secondary structure should be avoided in primers. For symmetric
PCR, a 40-60% G+C content is often recommended for both primers, with no
long stretches of any one base. The classical melting temperature calculations
used in conjunction with DNA probe hybridization studies often predict that a
given primer should anneal at a specific temperature or that the 72°C extension
temperature will dissociate the primer/template hybrid prematurely. In
practice, the hybrids are more effective in the PCR process than generally
predicted by simple Tm calculations.

Optimum annealing temperatures may be determined empirically and may be
higher than predicted. Taq DNA polymerase does have activity in the 37-55°C
region, so primer extension will occur during the annealing step and the hybrid
will be stabilised. The concentrations of the primers are equal in conventional
(symmetric) PCR and, typically, within 0.1- to 1µM range.
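
Some of the primer properties described above can be checked mechanically.
The following is a minimal sketch with hypothetical function names and
invented primer sequences. It computes G+C content, finds the longest run of a
single base, and measures complementarity anchored at the two 3' ends; it does
not detect internal secondary structure or complementarity away from the 3'
termini.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(primer):
    """Fraction of G and C bases in the primer (0.4-0.6 is the range cited)."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def longest_single_base_run(primer):
    """Length of the longest stretch of one repeated base."""
    p = primer.upper()
    longest = run = 1
    for previous, current in zip(p, p[1:]):
        run = run + 1 if current == previous else 1
        longest = max(longest, run)
    return longest


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(primer_a, primer_b):
    """Longest k such that the last k bases of primer_a can base-pair, in
    antiparallel orientation, with the last k bases of primer_b."""
    a, b = primer_a.upper(), primer_b.upper()
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == reverse_complement(b[-k:]):
            return k
    return 0


if __name__ == "__main__":
    # Invented primers sharing a self-complementary GGATCC at their 3' ends.
    forward, reverse = "ATGCCTGAGGATCC", "TTCAGTACGGGATCC"
    print(gc_fraction(forward), longest_single_base_run(forward))
    print(three_prime_overlap(forward, reverse) > 2)  # True flags dimer risk
```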

When a pair of suitable nucleic acids of the invention are used in a PCR it is
convenient to detect the product by gel electrophoresis and ethidium bromide
staining. As an alternative to detecting the product of DNA amplification
using agarose gel electrophoresis and ethidium bromide staining of the DNA, it
is convenient to use a labelled oligonucleotide capable of hybridising to the
amplified DNA as a probe. When the amplification is by a PCR the
oligonucleotide probe hybridises to the interprimer sequence as defined by the
two primers. The probe may be labelled with a radionuclide such as 32P, 33P
and 35S using standard techniques, or may be labelled with a fluorescent dye.
When the oligonucleotide probe is fluorescently labelled, the amplified DNA
product may be detected in solution (see for example Balaguer et al (1991)
"Quantification of DNA sequences obtained by polymerase chain reaction
using a bioluminescence adsorbent" Anal Biochem. 195, 105-110 and DiCesare
et al (1993) "A high-sensitivity electrochemiluminescence-based detection
system for automated PCR product quantitation" BioTechniques 15, 152-157).

PCR products can also be detected using a probe which may have a 
fluorophore-quencher pair or may be attached to a solid support or may have a 
biotin tag or they may be detected using a combination of a capture probe and a 
detector probe. 

Fluorophore-quencher pairs are particularly suited to quantitative
measurements of PCR reactions (eg RT-PCR). Fluorescence polarisation using
a suitable probe may also be used to detect PCR products.

Oligonucleotide primers can be synthesised using methods well known in the
art, for example using solid-phase phosphoramidite chemistry.

A polynucleotide or oligonucleotide primer of the invention may contain one
or more modified bases or may contain a backbone which has been modified
for stability purposes or for other reasons. By modified we include, for
example, tritylated bases and unusual bases such as inosine. A variety of
modifications can be made to DNA and RNA and these are included in the
scope of the invention.

In a preferred embodiment, the polynucleotides of the invention are detectably 
labelled. Suitable detectable labels are described in detail above. 

A fifteenth aspect of the invention provides an expression vector comprising a
polynucleotide as described above. Typically, the polynucleotides are those
which encode the polypeptides ECSM1 or ECSM4 or a fragment, variant or
fusion thereof.

By "expression vector" we mean one which is capable, in an appropriate host,
of expressing a polypeptide encoded by the polynucleotide.

Such vectors may be useful in expressing the encoded polypeptide in a host
cell for production of useful quantities of the polypeptide, or may be useful in
medicine. Expression vectors comprising a polynucleotide according to the
thirteenth or fourteenth aspects of the invention which are suitable for use in
gene therapy are within the scope of the invention. Administration of a gene
therapy vector capable of expressing the ECSM4 polypeptide may be useful in
modulating or inhibiting angiogenesis, since this polypeptide is likely to be a
repulsive guidance receptor. Similarly, gene therapy vectors capable of
expressing fragments or mutants of ECSM4 on the cell surface, which
fragments or mutants are capable of binding the ECSM4 cognate ligand but are
not able to convey the normal downstream signal (for example, because the
necessary cytosolic portion of the polypeptide is deleted or mutated so as to not
be functional or capable of binding normally interacting cellular proteins) may
also be useful in modulating angiogenesis in an individual.

Hence, in a preferred embodiment, the vector is one which is suitable for use in
gene therapy. Examples of suitable vectors and methods of their introduction
into cells are given in more detail below. In particular, the gene therapy
methods and vectors described in relation to the use of promoters of ECSM4
may also be used in relation to the use of ECSM4 coding sequences or
antisense in gene therapy.

It will be appreciated that the polynucleotide comprised within the expression
vector of this aspect of the invention may be one which encodes the
polypeptide ECSM4 or ECSM1 or a fragment or variant thereof, or the
polynucleotide may be one which is capable of selectively hybridising to the
ECSM4 or ECSM1 coding region. Polynucleotides which are capable of
hybridising to the ECSM4 or ECSM1 coding region are useful as antisense
polynucleotides which may decrease the expression level of ECSM4 or
ECSM1 within a target cell. The design of suitable and effective antisense
polynucleotides based on a known coding sequence is known in the art of gene
therapy.

Preferably, the expression vector of this aspect of the invention is one which
does not contain a polynucleotide sequence represented by any one of SEQ ID
No 18085 of EP 1 074 617, SEQ ID No 211 of either WO 00/53756 or
WO 99/46281, SEQ ID Nos 24-27, 29, 30, 33, 34, 38 or 39 of
WO 01/23523, or SEQ ID No 86 of WO 99/11293 or their complement.
Also preferably, the said vector is one which does not contain a
polynucleotide encoding a polypeptide whose amino acid sequence is
represented by any one of SEQ ID No 18085 of EP 1 074 617, SEQ ID No
211 of either WO 00/53756 or WO 99/46281, SEQ ID Nos 24-27, 29, 30,
33, 34, 38 or 39 of WO 01/23523, or SEQ ID No 86 of WO 99/11293.

Both the amount of therapeutic protein or therapeutic polynucleotide produced
and the duration of production are important issues in gene therapy.
Consequently, the use of viral vectors capable of cellular gene integration (eg
retroviral vectors) may be more beneficial than non-integrating alternatives (eg
adenovirus derived vectors) when repeated therapy is undesirable for
immunogenicity reasons.

By "therapeutic polynucleotide" or "therapeutic protein" we include ECSM4
and ECSM1 coding sequences, the polypeptide product encoded by said coding
sequences, and ECSM4 antisense polynucleotides. The therapeutic effect of
said polynucleotides or proteins may include pro-angiogenic or anti-angiogenic
effects, depending on the precise therapeutic agent administered. For example,
an expression vector suitable for gene therapy which comprises a
polynucleotide which is antisense to at least part of the ECSM4 coding region
may have anti-angiogenic activity when expressed in a host cell or patient if it
suppresses expression of a molecule which is required for angiogenesis. If the
polynucleotide comprised within the expression vector encodes a polypeptide
which is required for inhibition of angiogenesis (for example, because said
polypeptide has endothelial cell repulsive guidance activity), then expression of
the antisense may also be anti-angiogenic.

Conversely, if the said expression vector comprises a polynucleotide of the
invention which polynucleotide suppresses expression of a molecule whose
activity is required to decrease vascular growth (for example, because said
molecule is an endothelial cell repulsive guidance molecule) or encodes a
polypeptide whose activity is required for angiogenesis, administration of the
said vector may be pro-angiogenic.

Where the therapeutic gene is maintained extrachromosomally, the highest
level of expression is likely to be achieved using viral promoters, for example,
the Rous sarcoma virus long terminal repeat (Ragot et al (1993) Nature 361,
647-650; Hyde et al (1993) Nature 362, 250-255) and the adenovirus major
late promoter. The latter has been used successfully to drive the expression of
a cystic fibrosis transmembrane conductance regulator (CFTR) gene in lung
epithelium (Rosenfeld et al (1992) Cell 68, 143-155). Since these promoters
function in a broad range of tissues they may not be suitable to direct
cell-type-specific expression unless the delivery method can be adapted to
provide the specificity. However, somatic enhancer sequences could be used to
give cell-type-specific expression in an extrachromosomal setting.

As described in more detail below, the ECSM4 regulatory/promoter region is
an example of a regulatory region capable of conferring endothelial cell
selective expression, preferably selective to endothelial cells of neovasculature
(ie, angiogenic endothelial cells) on an operatively linked coding region. As
outlined above, such a coding region may encode an antisense polynucleotide.

Where withdrawal of the gene-vector construct is not possible, it may be 
necessary to add a suicide gene to the system to abort toxic reactions rapidly. 
The herpes simplex virus thymidine kinase gene, when transduced into cells, 
renders them sensitive to the drug ganciclovir, creating the option of killing the 
cells quickly. 

The use of ecotropic viruses, which are species specific, may provide a safer
alternative to the use of amphotropic viruses as vectors in gene therapy. In this
approach, a human homologue of the non-human, ecotropic viral receptor is
modified in such a way so as to allow recognition by the virus. The modified
receptor is then delivered to cells by constructing a molecule, the front end of
which is specified for the targeted cells and the tail part being the altered
receptor. Following delivery of the receptor to its target, the genetically
engineered ecotropic virus, carrying the therapeutic gene, can be injected and
will only integrate into the targeted cells.

Virus-derived gene transfer vectors can be adapted to recognise only specific 
cells so it may be possible to target to an endothelial cell, such as endothelial 
cells within a tumour. Similarly, it is possible to target expression of a 
therapeutic gene to the endothelial cell, using an endothelial cell-specific 
promoter such as that for the ECSM4 or ECSM1 genes. 

One of the ECSM genes or a part of the genes or a polynucleotide comprising 
an antisense to the gene may be introduced into the cell in a vector such that 
the gene remains extrachromosomal. In such a situation, the gene will be 
expressed by the cell from the extrachromosomal location. Vectors for 
introduction of genes both for recombination and for extrachromosomal 
maintenance are known in the art, and any suitable vector may be used. 
Methods for introducing DNA into cells such as electroporation, calcium 
phosphate co-precipitation and viral transduction are known in the art, and the 
choice of method is within the competence of the ordinary skilled person. 
Cells transformed with the wild-type novel gene can be used as model systems 
to study cancer remission and drug treatments which promote such remission. 

A variety of methods have been developed to operably link polynucleotides, 
especially DNA, to vectors, for example, via complementary cohesive termini. 
For instance, complementary homopolymer tracts can be added to the DNA 
segment to be inserted into the vector DNA. The vector and DNA segment are 
then joined by hydrogen bonding between the complementary homopolymeric 
tails to form recombinant DNA molecules. 

Synthetic linkers containing one or more restriction sites provide an alternative 
method of joining the DNA segment to vectors. The DNA segment, generated 
by endonuclease restriction digestion as described earlier, is treated with 
bacteriophage T4 DNA polymerase or E.coli DNA polymerase I, enzymes that 
remove protruding, 3'-single-stranded termini with their 3'-5'-exonucleolytic 
activities, and fill in recessed 3'-ends with their polymerising activities. 
The combination of these activities therefore generates blunt-ended DNA 
segments. The blunt-ended segments are then incubated with a large molar 
excess of linker molecules in the presence of an enzyme that is able to catalyse 
the ligation of blunt-ended DNA molecules, such as bacteriophage T4 DNA 
ligase. Thus, the products of the reaction are DNA segments carrying 
polymeric linker sequences at their ends. These DNA segments are then 
cleaved with the appropriate restriction enzyme and ligated to an expression 
vector that has been cleaved with an enzyme that produces termini compatible 
with those of the DNA segment. 

Synthetic linkers containing a variety of restriction endonuclease sites are 
commercially available from a number of sources including International 
Biotechnologies Inc., New Haven, CN, USA. 

A desirable way to modify the DNA encoding the polypeptide of the invention 
is to use PCR. This method may be used for introducing the DNA into a 
suitable vector, for example by engineering in suitable restriction sites, or it 
may be used to modify the DNA in other useful ways as is known in the art. 

In this method the DNA to be enzymatically amplified is flanked by two 
specific primers which themselves become incorporated into the amplified 
DNA. The said specific primers may contain restriction endonuclease 
recognition sites which can be used for cloning into expression vectors using 
methods known in the art. 
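
By way of illustration only, the primer arrangement described above may be sketched as follows. The enzyme sites chosen (EcoRI and BamHI), the two-base 5' clamp, the 18-base annealing length, the function names and the template sequence are all assumptions made for the example and are not taken from the present disclosure.

```python
# Minimal sketch: add restriction sites to the 5' ends of PCR primers.
# Enzyme choice, clamp and annealing length are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}  # both sites are palindromic


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, fwd_site="EcoRI", rev_site="BamHI",
                   anneal_len=18, clamp="GC"):
    """Return (forward, reverse) primers written 5'->3'.

    Each primer is: clamp + restriction site + annealing region.
    """
    template = template.upper()
    if len(template) < anneal_len:
        raise ValueError("template shorter than the annealing region")
    for name in (fwd_site, rev_site):
        # An internal site would be cut during cloning, so reject it.
        if SITES[name] in template:
            raise ValueError(name + " site occurs inside the template")
    forward = clamp + SITES[fwd_site] + template[:anneal_len]
    reverse = clamp + SITES[rev_site] + reverse_complement(template[-anneal_len:])
    return forward, reverse


# Hypothetical coding sequence, not taken from any figure of this document.
template = "ATGGCTAGCAAAGGTCCTGACCTGTTAACGCTGACTTGA"
forward, reverse = design_primers(template)
print(forward)
print(reverse)
```

Because the sites sit at the 5' ends of the primers, they become incorporated into the amplified DNA, which may then be cut with the corresponding enzymes and ligated into a vector cleaved to give compatible termini.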

The DNA (or in the case of retroviral vectors, RNA) is then expressed in a 
suitable host to produce a polypeptide comprising the polypeptide of the 
invention. Thus, the DNA encoding the polypeptide constituting the 
polypeptide of the invention may be used in accordance with known 
techniques, appropriately modified in view of the teachings contained herein, 
to construct an expression vector, which is then used to transform an 
appropriate host cell for the expression and production of the polypeptide of 
the invention. Such techniques include those disclosed in US Patent Nos. 
4,440,859 issued 3 April 1984 to Rutter et al, 4,530,901 issued 23 July 1985 to 
Weissman, 4,582,800 issued 15 April 1986 to Crowl, 4,677,063 issued 30 June 
1987 to Mark et al, 4,678,751 issued 7 July 1987 to Goeddel, 4,704,362 issued 
3 November 1987 to Itakura et al, 4,710,463 issued 1 December 1987 to 
Murray, 4,757,006 issued 12 July 1988 to Toole, Jr. et al, 4,766,075 issued 23 
August 1988 to Goeddel et al and 4,810,648 issued 7 March 1989 to Stalker, 
all of which are incorporated herein by reference. 

The DNA (or in the case of retroviral vectors, RNA) encoding the polypeptide 
constituting the polypeptide of the invention may be joined to a wide variety of 
other DNA sequences for introduction into an appropriate host. The companion 
DNA will depend upon the nature of the host, the manner of the introduction of 
the DNA into the host, and whether episomal maintenance or integration is 
desired. 

Generally, the DNA is inserted into an expression vector, such as a plasmid, in 
proper orientation and correct reading frame for expression. If necessary, the 
DNA may be linked to the appropriate transcriptional and translational 
regulatory control nucleotide sequences recognised by the desired host, 
although such controls are generally available in the expression vector. The 
vector is then introduced into the host through standard techniques. Generally, 
not all of the hosts will be transformed by the vector. Therefore, it will be 
necessary to select for transformed host cells. One selection technique 
involves incorporating into the expression vector a DNA sequence, with any 
necessary control elements, that codes for a selectable trait in the transformed 
cell, such as antibiotic resistance. Alternatively, the gene for such selectable 
trait can be on another vector, which is used to co-transform the desired host 
cell. 
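
The requirement for proper orientation and correct reading frame may be checked in silico before cloning. The following is a minimal sketch which assumes that the insert is a complete coding sequence running from an ATG start codon to a stop codon; it does not model the junction with any particular vector, and the function names and test sequences are hypothetical.

```python
# Minimal sketch: check that an insert reads as one open reading frame and
# report its orientation. Assumes the insert spans ATG ... stop exactly.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_open_reading_frame(seq):
    """True if seq starts with ATG, is a whole number of codons, ends with
    a stop codon and contains no earlier in-frame stop codon."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[:-1])


def insert_orientation(seq):
    """Return 'forward', 'reverse' or 'none' for the given insert strand."""
    if is_open_reading_frame(seq):
        return "forward"
    if is_open_reading_frame(reverse_complement(seq)):
        return "reverse"
    return "none"


# Hypothetical sequences for illustration only.
orf = "ATGGCTAAAGGTTGA"
print(insert_orientation(orf))
print(insert_orientation(reverse_complement(orf)))
print(insert_orientation(orf[:-1]))  # one base lost, so no intact frame
```

An insert reported as "reverse" would need to be cloned in the opposite orientation, and one reported as "none" would need its frame restored, before expression of the polypeptide could be expected.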

Host cells that have been transformed by the expression vector of the invention 
are then cultured for a sufficient time and under appropriate conditions known 
to those skilled in the art in view of the teachings disclosed herein to permit the 
expression of the polypeptide, which can then be recovered. 

Many expression systems are known, including bacteria (for example, E.coli 
and Bacillus subtilis), yeasts (for example Saccharomyces cerevisiae), 
filamentous fungi (for example Aspergillus), plant cells, animal cells and insect 
cells. 

The vectors typically include a prokaryotic replicon, such as the ColE1 ori, for 
propagation in a prokaryote, even if the vector is to be used for expression in 
other, non-prokaryotic, cell types. The vectors can also include an appropriate 
promoter such as a prokaryotic promoter capable of directing the expression 
(transcription and translation) of the genes in a bacterial host cell, such as 
E.coli, transformed therewith. 

A promoter is an expression control element formed by a DNA sequence that 
permits binding of RNA polymerase and transcription to occur. Promoter 
sequences compatible with exemplary bacterial hosts are typically provided in 
plasmid vectors containing convenient restriction sites for insertion of a DNA 
segment of the present invention. 

Typical prokaryotic vector plasmids are pUC18, pUC19, pBR322 and pBR329 
available from Biorad Laboratories, (Richmond, CA, USA) and pTrc99A and 
pKK223-3 available from Pharmacia, Piscataway, NJ, USA. 

A typical mammalian cell vector plasmid is pSVL available from Pharmacia, 
Piscataway, NJ, USA. This vector uses the SV40 late promoter to drive 
expression of cloned genes, the highest level of expression being found in T 
antigen-producing cells, such as COS-1 cells. 

An example of an inducible mammalian expression vector is pMSG, also 
available from Pharmacia. This vector uses the glucocorticoid-inducible 
promoter of the mouse mammary tumour virus long terminal repeat to drive 
expression of the cloned gene. 

Useful yeast plasmid vectors are pRS403-406 and pRS413-416 and are 
generally available from Stratagene Cloning Systems, La Jolla, CA 92037, 
USA. Plasmids pRS403, pRS404, pRS405 and pRS406 are Yeast Integrating 
plasmids (YIps) and incorporate the yeast selectable markers HIS3, TRP1, 
LEU2 and URA3. Plasmids pRS413-416 are Yeast Centromere plasmids 
(YCps). 

Other vectors and expression systems are well known in the art for use with a 
variety of host cells. 

A sixteenth aspect of the invention provides a recombinant host cell 
comprising a polynucleotide or vector of the invention. 

The polynucleotide of the invention includes polynucleotides encoding a 
compound of the third aspect of the invention (where both the moiety which 
selectively binds and the further moiety are polypeptides which are fused) or an 
ECSM4 or ECSM1 polypeptide of the invention or a fragment or fusion or 
variant thereof as defined above. 

The host cell can be either prokaryotic or eukaryotic. Bacterial cells are 
preferred prokaryotic host cells and typically are a strain of E.coli such as, for 
example, the E.coli strains DH5 available from Bethesda Research 
Laboratories Inc., Bethesda, MD, USA, and RR1 available from the American 
Type Culture Collection (ATCC) of Rockville, MD, USA (No. ATCC 31343). 
Preferred eukaryotic host cells include yeast, insect and mammalian cells, 
preferably vertebrate cells such as those from a mouse, rat, monkey or human 
fibroblastic and kidney cell lines. Yeast host cells include YPH499, YPH500 
and YPH501 which are generally available from Stratagene Cloning Systems, 
La Jolla, CA 92037, USA. Preferred mammalian host cells include Chinese 
hamster ovary (CHO) cells available from the ATCC as CRL 1658 and 293 
cells which are human embryonic kidney cells. Preferred insect cells are Sf9 
cells which can be transfected with baculovirus expression vectors. 

Transformation of appropriate cell hosts with a DNA construct of the present 
invention is accomplished by well known methods that typically depend on the 
type of vector used. With regard to transformation of prokaryotic host cells, 
see, for example, Cohen et al (1972) Proc. Natl. Acad. Sci. USA 69, 2110 and 
Sambrook et al (1989) Molecular Cloning, A Laboratory Manual, Cold Spring 
Harbor Laboratory, Cold Spring Harbor, NY. Transformation of yeast cells is 
described in Sherman et al (1986) Methods In Yeast Genetics, A Laboratory 
Manual, Cold Spring Harbor, NY. The method of Beggs (1978) Nature 275, 
104-109 is also useful. With regard to vertebrate cells, reagents useful in 
transfecting such cells, for example calcium phosphate and DEAE-dextran or 
liposome formulations, are available from Stratagene Cloning Systems, or Life 
Technologies Inc., Gaithersburg, MD 20877, USA. 

Electroporation is also useful for transforming and/or transfecting cells and is 
well known in the art for transforming yeast cells, bacterial cells, insect cells 
and vertebrate cells. 

For example, many bacterial species may be transformed by the methods 
described in Luchansky et al (1988) Mol. Microbiol. 2, 637-646 incorporated 
herein by reference. The greatest number of transformants is consistently 
recovered following electroporation of the DNA-cell mixture suspended in 2.5X 
PEB using 6250V per cm at 25 µFD. 

Methods for transformation of yeast by electroporation are disclosed in Becker 
& Guarente (1990) Methods Enzymol. 194, 182. 

Successfully transformed cells, ie cells that contain a DNA construct of the 
present invention, can be identified by well-known techniques. For example, 
cells resulting from the introduction of an expression construct of the present 
invention can be grown to produce the polypeptide of the invention. Cells can 
be harvested and lysed and their DNA content examined for the presence of the 
DNA using a method such as that described by Southern (1975) J. Mol. Biol. 
98, 503 or Berent et al (1985) Biotech 3, 208. Alternatively, the presence of 
the protein in the supernatant can be detected using antibodies as described 
below. 

In addition to directly assaying for the presence of recombinant DNA, 
successful transformation can be confirmed by well known immunological 
methods when the recombinant DNA is capable of directing the expression of 
the protein. For example, cells successfully transformed with an expression 
vector produce proteins displaying appropriate antigenicity. 

Samples of cells suspected of being transformed are harvested and assayed for 
the protein using suitable antibodies. 

The host cell may be a host cell within an animal body. Thus, transgenic 
animals which express a polypeptide of the first or third aspects of the 
invention by virtue of the presence of the transgene are included. Preferably, 
the transgenic animal is a rodent such as a mouse. Transgenic animals can be 
made using methods well known in the art. 

Polynucleotides encoding the polypeptide ECSM4 may be useful in generating 
transgenic non-human mammals wherein the ECSM4 is mutated in some way. 
For example, the mouse ECSM4 genomic coding region may be mutated in a 
mouse so as to produce an ECSM4 polypeptide which is incapable of binding 
its natural ligand, or incapable of correctly interacting with intracellular 
components. Such a mutated ECSM4 polypeptide may produce a disease in 
the mouse which is very similar to a disease involving abnormal 
vascularisation in humans. 

Hence, non-human mammals, especially rodents such as mice and rats, are 
useful as models of diseases involving abnormal vascularisation. 

Alternatively, mammals lacking the ECSM4 gene ("knock-outs") or lacking an 
ECSM4 genomic coding region which is capable of being transcribed or of 
expressing the ECSM4 polypeptide, may be useful in providing a means of 
generating antibodies selective for the human ECSM4 polypeptide. Such 
mammals, especially mice, are likely to be particularly useful since the high 
level of homology between the human and mouse ECSM4 polypeptides may 
prevent human ECSM4 polypeptide from being antigenic in mice which do 
express the ECSM4 polypeptide. 

A potentially more accurate animal model of diseases involving abnormal 
vascularisation may be made by adding to the genome of a transgenic animal 
as described above, or by replacing the genomic ECSM4 of an animal with, the 
gene for human ECSM4 which has been mutated. Suitably, the human ECSM4 
inserted will be under control of an endothelial selective promoter and 
regulatory region. Preferably, the promoter and regulatory regions are those of 
the host animal ECSM4 gene. An animal whose genome is modified in this way 
will express the dysfunctional human ECSM4, and therefore will be useful in 
testing the efficacy of drugs and antibodies in the diagnosis, prognosis and 
treatment of diseases involving abnormal vascularisation in humans. 

Such knockout or transgenic mammals are within the scope of the invention 
and antibodies generated using such mammals and compounds comprising 
them are also included within the scope of the invention. 

A seventeenth aspect of the invention provides a method of producing a 
polypeptide of the invention, the method comprising expressing a 
polynucleotide as described above or culturing a host cell as described herein. 

It will be appreciated that in order to produce the ECSM1 polypeptide, the host 
cell may comprise a polynucleotide encoding a polypeptide whose amino acid 
sequence includes the sequence given in Figure 2, and that in order to produce 
the ECSM4 polypeptide the host cell may comprise a polynucleotide encoding 
the polypeptide whose amino acid sequence is given in Figure 4 or Figure 7 or 
Figure 12 and so on. 

Preferably, the polynucleotide expressed does not consist of any one of the 
nucleotide sequences represented by SEQ ID No 18084 or 5096 of EP 1 074 
617, SEQ ID No 210 of WO 00/53756 or WO 99/46281, or SEQ ID Nos 22, 
23, 96 or 98 of WO 01/23523 and SEQ ID No 31 of WO 99/11293. 

Also preferably, the polypeptide produced is not one with an amino acid 
sequence consisting of the sequence represented by any one of SEQ ID No 
18085 of EP 1 074 617, SEQ ID No 211 of either WO 00/53756 or 
WO 99/46281, SEQ ID Nos 24-27, 29, 30, 33, 34, 38 or 39 of WO 01/23523, or 
SEQ ID No 86 of WO 99/11293. 

Methods of cultivating host cells and isolating recombinant proteins are well 
known in the art. It will be appreciated that, depending on the host cell, the 
ECSM1 or ECSM4 polypeptides produced may differ from those which can be 
isolated from nature. For example, certain host cells, such as yeast or bacterial 
cells, either do not have, or have different, post-translational modification 
systems which may result in the production of forms of ECSM1 or ECSM4 
which may be post-translationally modified in a different way to ECSM1 or 
ECSM4 isolated from nature. In order to obtain ECSM1 or ECSM4 which is 
post-translationally modified in a different way to human ECSM1 or ECSM4 it 
is preferred if the host cell is a non-human host cell; more preferably it is not a 
mammalian cell. 

It is preferred that the ECSM1 or ECSM4 polypeptide is produced in a 
eukaryotic system, such as an insect cell. 

According to a less preferred embodiment, the ECSM1 or ECSM4 polypeptide 
can be produced in vitro using a commercially available in vitro translation 
system, such as rabbit reticulocyte lysate or wheatgerm lysate (available from 
Promega). Preferably, the translation system is rabbit reticulocyte lysate. 
Conveniently, the translation system may be coupled to a transcription system, 
such as the TNT transcription-translation system (Promega). This system has 
the advantage of producing suitable mRNA transcript from an encoding DNA 
polynucleotide in the same reaction as the translation. Conveniently, where the 
expressed polypeptide comprises one or more transmembrane domains, the 
translation system can be supplemented with a source of endoplasmic 
reticulum-derived membranes and folding chaperones, such as dog pancreatic 
microsomes, to allow synthesis of the polypeptide in a native conformation. 

Preferably, the production method of this aspect of the invention comprises a 
further step of isolating the ECSM1 or ECSM4 produced from the host cell or 
from the in vitro translation mix. Preferably, the isolation employs an antibody 
which selectively binds the expressed polypeptide of the invention. 

It will be understood that the invention comprises the ECSM1 or ECSM4 
polypeptides or the variants or fragments or fusions thereof, or a fusion of said 
variants or fragments obtainable by the methods herein disclosed, provided that 
the ECSM4 polypeptide is not one which consists of the amino acid sequence 
given in Figure 4. Preferably, the polypeptide is not one which consists of an 
amino acid sequence represented by any one of SEQ ID No 18085 of EP 1 074 
617, SEQ ID No 211 of either WO 00/53756 or WO 99/46281, SEQ ID Nos 
24-27, 29, 30, 33, 34, 38 or 39 of WO 01/23523, or SEQ ID No 86 of 
WO 99/11293. Preferably, the ECSM1 polypeptide produced by the methods 
herein disclosed is not one which is encoded by SEQ ID No 32 of WO 
99/06423 or encoded by the nucleic acid of ATCC deposit No 209145 made on 
July 17 1997 for the purposes of WO 99/06423. 

An eighteenth aspect of the invention provides an antibody capable of 
selectively binding to either ECSM4 or ECSM1 as defined above. 

Preferably, an antibody which selectively binds ECSM1 is not one which binds 
a polypeptide encoded by SEQ ID No 32 of WO 99/06423 or encoded by the 
nucleic acid of ATCC deposit No 209145 made on July 17 1997 for the 
purposes of the international patent application PCT/US98/15949. 

Preferably, an antibody which selectively binds ECSM1 is one which binds a 
polypeptide whose amino acid sequence comprises the sequence given in 
Figure 2 or a natural variant thereof but does not comprise the amino acid 
sequence encoded by ATCC deposit No 209145 made on July 17 1997. 

Preferably, an antibody which selectively binds ECSM4 is one which binds a 
polypeptide whose amino acid sequence comprises the sequence given in any 
one of Figures 4, 5, 7, 12 or 13 or a natural variant thereof but does not bind 
the polypeptide represented by any one of SEQ ID No 18085 of EP 1 074 617, 
SEQ ID No 211 of either WO 00/53756 or WO 99/46281, SEQ ID Nos 24-27, 
29, 30, 33, 34, 38 or 39 of WO 01/23523, or SEQ ID No 86 of WO 99/11293, 
or encoded by any one of the nucleotide sequences represented by SEQ ID No 
18084 or 5096 of EP 1 074 617, SEQ ID No 210 of WO 00/53756 or 
WO 99/46281, or SEQ ID Nos 22, 23, 96 or 98 of WO 01/23523 and SEQ ID 
No 31 of WO 99/11293. 

By "selectively bind" we include antibodies which bind at least 10-fold more 
strongly to a polypeptide of the invention (such as ECSM4 or ECSM1) than to 
another polypeptide; preferably at least 50-fold more strongly and more 
preferably at least 100-fold more strongly. Such antibodies may be made by 
methods well known in the art using the information concerning the differences 
in amino acid sequence of ECSM4 or ECSM1 and another polypeptide which 
is not a polypeptide of the invention. 
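
The thresholds in this definition may be illustrated with a short calculation. The sketch below assumes, purely for the purposes of the example, that binding strength is compared as a ratio of equilibrium dissociation constants (a lower constant meaning stronger binding); the present disclosure does not prescribe how binding strength is measured, and the function names and tier labels are hypothetical.

```python
# Minimal sketch: classify antibody selectivity by fold difference in binding.
# Assumption: binding strength is compared via dissociation constants (Kd),
# where a lower Kd means stronger binding.


def fold_selectivity(kd_target, kd_other):
    """Return how many fold more strongly the target is bound than the
    other polypeptide. Both constants must use the same units."""
    if kd_target <= 0 or kd_other <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_other / kd_target


def selectivity_tier(fold):
    """Map a fold difference onto the thresholds of the definition."""
    if fold >= 100:
        return "100-fold"
    if fold >= 50:
        return "50-fold"
    if fold >= 10:
        return "10-fold"
    return "below threshold"


# Hypothetical measurements in nM, for illustration only.
print(selectivity_tier(fold_selectivity(2.0, 120.0)))
```

An antibody falling in the "below threshold" tier against a given comparison polypeptide would not be regarded as selectively binding under the definition above.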

Antibodies which selectively bind ECSM4 may also modulate the function of 
the ECSM4 polypeptide. Such antibodies may mimic the effect of binding of the 
cognate ligand by stimulating or activating ECSM4, or may bind and thereby 
prevent subsequent binding and activation or stimulation of ECSM4 by the 
cognate ligand; such function-modulating antibodies are included in the 
scope of the invention. It will be appreciated that antibodies which modulate 
the function are useful as a tool in research, for example in studying the effects 
of ECSM4 stimulation or activation, or downstream processes triggered by 
such stimulation. Such antibodies are also useful in medicine, for example in 
modulating angiogenesis in an individual. Specifically, modulation of 
angiogenesis by administration of such an antibody may be useful in the 
treatment of a disease in an individual where modulation of angiogenesis 
would be beneficial, such as cancer. 

The following peptides may be useful as immunogens in the generation of 
antibodies, such as rabbit polyclonal sera: LSQSPGAVPQALVAWRA, 
DSVLTPEEVALCLEL, TYGYISVPTA and KGGVLLCPPRPCLTPT. 

In a preferred embodiment of this aspect, the antibody of the invention 
selectively binds an amino acid sequence with the sequence 
GGDSLLGGRGSL, LLQPPARGHAHDGQALSTDL, EPQDYTEPVE, 
TAPGGQGAPWAEE or ERATQEPSEHGP. These sequences represent 
amino acid sequences which are not identical between the human and mouse 
ECSM4 polypeptide sequences. Generally, the human and mouse ECSM4 
polypeptides display a high degree of identity, which makes the production of 
mouse antibodies to the human ECSM4 particularly difficult due to the lack of 
immunogenicity of much of the human ECSM4 sequence in mouse. Amino 
acid sequences which are absent from the mouse ECSM4 are more likely to 
be immunogenic in a mouse than those sequences which are present in 
the mouse ECSM4 (an alignment of the human and mouse ECSM4 amino acid 
sequences is shown in Figure 14). Hence, polypeptide fragments which 
contain sequences which are unique to human ECSM4 as described above are 
more useful than ECSM4 polypeptides whose sequence is found in both human 
and mouse ECSM4, in the production of antibodies which selectively bind the 
human ECSM4 polypeptide. 
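
The selection of candidate immunogen sequences from an alignment may be sketched as follows. This is an illustration only: it assumes a pairwise alignment supplied as two equal-length strings with "-" marking gaps, it takes a minimum run length of eight residues as an arbitrary cut-off, and the toy sequences are hypothetical and are not the ECSM4 sequences of Figure 14.

```python
# Minimal sketch: list runs of human residues that differ from, or are
# absent in, the aligned mouse sequence. Inputs are aligned strings of
# equal length with '-' as the gap character.


def human_unique_segments(human_aln, mouse_aln, min_len=8):
    """Return human segments of at least min_len consecutive residues in
    which every residue is mismatched or unmatched in the mouse sequence."""
    if len(human_aln) != len(mouse_aln):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    current = []
    for human_res, mouse_res in zip(human_aln, mouse_aln):
        if human_res == "-":
            # No human residue in this column, so the human run is unbroken.
            continue
        if human_res != mouse_res:
            current.append(human_res)
        else:
            if len(current) >= min_len:
                segments.append("".join(current))
            current = []
    if len(current) >= min_len:
        segments.append("".join(current))
    return segments


# Toy alignment for illustration only.
human = "MKTLLQPPARGHGSW"
mouse = "MKT--------HGSW"
print(human_unique_segments(human, mouse))
```

Segments returned in this way are only candidates; whether a given segment is in fact immunogenic in a mouse, and whether it lies in the extracellular portion of the polypeptide, would need to be established separately.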

Antibodies generated as a result of use of amino acid sequences which are 
located in the extracellular portion of the ECSM4 polypeptide are likely to be 
useful as endothelial cell targeting molecules. Therefore, it is particularly 
preferred if the antibody of the invention is raised to, and preferably selectively 
binds, an amino acid sequence which is unique to the human ECSM4 
polypeptide, which sequence is located towards the N-terminal end of the 
polypeptide and is found in the extracellular portion located between residues 1 
and 467 of the amino acid sequence given in Figure 12. An example of an 
amino acid sequence which is suitable for raising antibody molecules selective 
for the ECSM4 extracellular region is given in Figure 12. 

Although the amino acid sequences which are unique to the human ECSM4 
may be used to produce polyclonal antibodies, it is preferred if they are used to 
produce monoclonal antibodies. 

Peptides in which one or more of the amino acid residues are chemically 
modified, before or after the peptide is synthesised, may be used providing that 
the function of the peptide, namely the production of specific antibodies in 
vivo, remains substantially unchanged. Such modifications include forming 
salts with acids or bases, especially physiologically acceptable organic or 
inorganic acids and bases, forming an ester or amide of a terminal carboxyl group, 
and attaching amino acid protecting groups such as N-t-butoxycarbonyl. Such 
modifications may protect the peptide from in vivo metabolism. The peptides 
may be present as single copies or as multiples, for example tandem repeats. 
Such tandem or multiple repeats may be sufficiently antigenic themselves to 
obviate the use of a carrier. It may be advantageous for the peptide to be 
formed as a loop, with the N-terminal and C-terminal ends joined together, or 
to add one or more Cys residues to an end to increase antigenicity and/or to 
allow disulphide bonds to be formed. If the peptide is covalently linked to a 
carrier, preferably a polypeptide, then the arrangement is preferably such that 
the peptide of the invention forms a loop. 

According to current immunological theories, a carrier function should be 
present in any immunogenic formulation in order to stimulate, or enhance 
stimulation of, the immune system. It is thought that the best carriers embody 
(or, together with the antigen, create) a T-cell epitope. The peptides may be 
associated, for example by cross-linking, with a separate carrier, such as serum 
albumins, myoglobins, bacterial toxoids and keyhole limpet haemocyanin. 
More recently developed carriers which induce T-cell help in the immune 
response include the hepatitis-B core antigen (also called the nucleocapsid 
protein), presumed T-cell epitopes such as 
Thr-Ala-Ser-Gly-Val-Ala-Glu-Thr-Thr-Asn-Cys, β-galactosidase and the 
163-171 peptide of interleukin-1. The 
latter compound may variously be regarded as a carrier or as an adjuvant or as 
both. Alternatively, several copies of the same or different peptides of the 
invention may be cross-linked to one another; in this situation there is no 
separate carrier as such, but a carrier function may be provided by such 
cross-linking. Suitable cross-linking agents include those listed as such in the Sigma 
and Pierce catalogues, for example glutaraldehyde, carbodiimide and 
succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate, the latter 
agent exploiting the -SH group on the C-terminal cysteine residue (if present). 

If the peptide is prepared by expression of a suitable nucleotide sequence in a 
suitable host, then it may be advantageous to express the peptide as a fusion 
product with a peptide sequence which acts as a carrier. Kabigen's "Ecosec" 
system is an example of such an arrangement. 

Peptides may be synthesised by the Fmoc-polyamide mode of solid-phase 
peptide synthesis as disclosed by Lu et al (1981) J. Org. Chem. 46, 3433 and 
references therein. Temporary N-amino group protection is afforded by the 
9-fluorenylmethyloxycarbonyl (Fmoc) group. Repetitive cleavage of this highly 
base-labile protecting group is effected using 20% piperidine in 
N,N-dimethylformamide. Side-chain functionalities may be protected as their butyl 
ethers (in the case of serine, threonine and tyrosine), butyl esters (in the case of 
glutamic acid and aspartic acid), butyloxycarbonyl derivative (in the case of 
lysine and histidine), trityl derivative (in the case of cysteine) and 
4-methoxy-2,3,6-trimethylbenzenesulphonyl derivative (in the case of arginine). Where 
glutamine or asparagine are C-terminal residues, use is made of the 
4,4'-dimethoxybenzhydryl group for protection of the side chain amido 
functionalities. The solid-phase support is based on a polydimethyl-acrylamide 
polymer constituted from the three monomers dimethylacrylamide 
(backbone-monomer), bisacryloylethylene diamine (cross linker) and acryloylsarcosine 
methyl ester (functionalising agent). The peptide-to-resin cleavable linked 
agent used is the acid-labile 4-hydroxymethyl-phenoxyacetic acid derivative. 
All amino acid derivatives are added as their preformed symmetrical anhydride 
derivatives with the exception of asparagine and glutamine, which are added 

Antibodies can be raised in an animal by immunising with an appropriate 
peptide. Appropriate peptides are described herein. Alternatively, with 
today's technology, it is possible to make antibodies as defined herein without 
the need to use animals. Such techniques include, for example, antibody phage 
display technology as is well known in the art. Appropriate peptides, as 
described herein, may be used to select antibodies produced in this way. 

It will be appreciated that, with the advancements in antibody technology, it 
may not be necessary to immunise an animal in order to produce an antibody. 
Synthetic systems, such as phage display libraries, may be used. The use of 
such systems is included in the methods of the invention and the products of 
such systems are "antibodies" for the purposes of the invention. 

It will be appreciated that such antibodies which recognise ECSM1 or ECSM4 
and variants or fragments thereof are useful research reagents and therapeutic 
agents, particularly when prepared as a compound of the invention as described 
above. Suitably, the antibodies of the invention are detectably labelled, for 
example they may be labelled in such a way that they may be directly or 
indirectly detected. Conveniently, the antibodies are labelled with a 
radioactive moiety or a coloured moiety or a fluorescent moiety, or they may 
be linked to an enzyme. Typically, the enzyme is one which can convert a 
non-coloured (or non-fluorescent) substrate to a coloured (or fluorescent) 
product. The antibody may be labelled by biotin (or streptavidin) and then 
detected indirectly using streptavidin (or biotin) which has been labelled with a 
radioactive moiety or a coloured moiety or a fluorescent moiety, or the like, or 
they may be linked to any enzyme of the type described above. 

A nineteenth aspect of the invention provides a method of detecting endothelial 
damage or activation in an individual comprising obtaining a fluid sample from 
the individual and detecting the presence of fragments of ECSM1 or ECSM4 in 
the sample. 

Preferably, the fluid sample is blood. Typically, the presence of peptide
fragments derived from ECSM1 or ECSM4 is detected.

In a preferred embodiment of this aspect, the presence of peptide fragments of
the ECSM1 or ECSM4 polypeptides is detected using an antibody selective
for a polypeptide whose amino acid sequence comprises a sequence given in
either one of Figure 2 or Figure 4 or Figure 12 or fragments thereof.
Preferably, the antibody is an antibody according to the eighteenth aspect of
the invention. Typically, such an antibody would be detectably labelled.

Detecting or diagnosing endothelial cell damage in an individual is useful in
diagnosing cancer or aiding diagnosis of cardiac disease, endometriosis or
atherosclerosis in that individual. It may be that certain levels of apparent cell
damage are detected in individuals who do not have cancer, cardiac disease,
endometriosis or atherosclerosis. It may therefore be necessary to compare the
amount of endothelial cell damage detected with the amounts or levels observed
in individuals who are known to have cancer, cardiac disease, endometriosis or
atherosclerosis, and with the "normal" levels of apparent damage in individuals
who do not have cancer, cardiac disease, endometriosis or atherosclerosis.

Hence, detection of endothelial damage or activation in an individual may be
useful as a means of detecting the presence or extent or growth rate of a tumour
in that individual. The detection of vessel damage is an indirect report of the
formation of tumour neovasculature. In this way, ECSM4 or ECSM1 may be
surrogate markers of angiogenesis. The presence of ECSM4 or ECSM1
fragments in a sample from the individual, or more ECSM4 or ECSM1
polypeptide fragments than in an individual who does not have a tumour, may
be a means of detecting a tumour, or growth of a known tumour, in that
individual.

Furthermore, it will be appreciated that detection of neovasculature by means
of detecting the presence of, or a certain level of, ECSM4 or ECSM1 in a
sample from an individual may be useful in determining whether a treatment in
that individual is effective, and/or to what extent the treatment is effective.
Preferably the therapy is to treat a tumour or cancer in the individual.

Hence, an aspect of the invention provides a method of detecting a tumour or
tumour neovasculature or cardiac disease or endometriosis or atherosclerosis
in an individual comprising obtaining a fluid sample from the individual and
detecting the presence of fragments of ECSM1 or ECSM4 in the sample.

As described above in relation to detecting or diagnosing endothelial cell
damage, detection of the disease (such as a tumour or cardiac disease etc) by
means of detecting the presence of, or a certain level of, ECSM4 or ECSM1 in
a sample from an individual may be useful in determining the efficacy of a
treatment in that individual.

In one embodiment, the therapy is gene therapy.

Preferably, the efficacy of a treatment in an individual is determined by
measuring the amount of fragments of ECSM1 or ECSM4 found in the fluid sample
of the individual and comparing it to the amount of ECSM1 or ECSM4
fragments in a sample from an individual who does not have cancer, cardiac
disease, endometriosis or atherosclerosis and/or to the amount in a sample
from the individual prior to commencement of said treatment. The comparison
indicates the efficacy of treatment of the individual, wherein if there is no
change in the amount of fragments determined before and during/after
treatment this is indicative of poor efficacy of the treatment. A decrease in the
amount of fragments found during or after treatment compared to the amount
found before treatment was started indicates some efficacy of the treatment in
ameliorating the condition of the individual.
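
By way of illustration only, the before/after comparison described above can be sketched in Python. The function name, the arbitrary units and the relative tolerance used to decide that two measurements are "unchanged" are assumptions made for this sketch and are not part of the method as described; the text does not say how an increase should be interpreted, so it is reported separately.

```python
def assess_efficacy(before, after, tolerance=0.05):
    """Classify a treatment response from fragment amounts (arbitrary units).

    `tolerance` is a hypothetical relative threshold below which the two
    measurements are treated as unchanged; it is not taken from the text.
    """
    if before <= 0:
        raise ValueError("baseline amount must be positive")
    change = (after - before) / before
    if change < -tolerance:
        # A decrease during or after treatment indicates some efficacy.
        return "decrease: some efficacy"
    if abs(change) <= tolerance:
        # No change before versus during/after treatment indicates poor efficacy.
        return "no change: poor efficacy"
    # An increase is not addressed by the text above.
    return "increase: not addressed"


if __name__ == "__main__":
    print(assess_efficacy(100.0, 60.0))
    print(assess_efficacy(100.0, 101.0))
```

In practice the threshold would have to reflect the assay's measurement variability, which is why it is left as a parameter.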

Current methods of assessing the efficacy of various anti-angiogenic therapies
being tested in clinical trials are invasive. The selective expression of ECSM4
on endothelial cells of angiogenic blood vessels means that detecting the
presence, absence, increase or decrease in the level of ECSM1 or ECSM4 in a
subject undergoing therapy is a means of determining the efficacy of the
therapy in that subject without the need, or with a reduced need, for invasive
biopsies, scans and the like.

Hence, determination of the level of ECSM1 and/or ECSM4 fragments in the
blood of an individual undergoing an anti-angiogenic therapy (such as cancer
therapy) may act as a "surrogate marker of angiogenesis".

By "peptide fragments derived from ECSM1 or ECSM4" we mean peptides
which have at least 5 consecutive amino acids of the ECSM4 or ECSM1
polypeptide. Typically, the fragments have at least 8 consecutive amino acids,
preferably at least 10, more preferably at least 12 or 15 or 20 or 30 or 40 or 50
consecutive amino acids of the ECSM4 or ECSM1 polypeptide.
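
The definition above amounts to a shared run of consecutive residues, and can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name is hypothetical, and the sequence used in the example is an arbitrary placeholder rather than the ECSM1 or ECSM4 sequence.

```python
def is_derived_fragment(peptide, polypeptide, min_run=5):
    """Return True if `peptide` contains at least `min_run` consecutive
    amino acids that also occur consecutively in `polypeptide`.

    Sequences are one-letter amino acid strings.
    """
    peptide = peptide.upper()
    polypeptide = polypeptide.upper()
    if len(peptide) < min_run:
        return False
    # Slide a window of min_run residues along the peptide and look for
    # an exact occurrence of that window in the parent polypeptide.
    return any(
        peptide[i:i + min_run] in polypeptide
        for i in range(len(peptide) - min_run + 1)
    )


if __name__ == "__main__":
    parent = "MKTAYIAKQRQISFVKSHFSRQ"  # placeholder sequence, not ECSM4
    print(is_derived_fragment("GGAYIAKGG", parent))
    print(is_derived_fragment("AYIA", parent))
```

Raising `min_run` to 8, 10, 12 and so on gives the progressively preferred fragment lengths listed above.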

Methods for detecting the presence of fragments of peptides derived from
larger polypeptides are known in the art.

A further aspect of the invention provides a method of modulating
angiogenesis in an individual, the method comprising administering to the
individual ECSM4 or a peptide fragment of ECSM4 or a ligand of ECSM4 or
an antibody which selectively binds to ECSM4 or ECSM1.

Preferably, the peptide fragment or ligand or antibody is one which modulates
the activity or function, either directly or indirectly, of the ECSM4 polypeptide
of the individual.

Preferred antibodies are those as described in more detail above.

The production of antibodies which modulate the function of a polypeptide
exposed on the cell surface is known in the art and is discussed in more detail
above. Such antibodies may modulate the function by imitating the function of
the natural ligand and stimulating the polypeptide into activity or function, or
may modulate the polypeptide function by preventing stimulation of the
polypeptide by the ligand, sterically obscuring the ligand binding site and
thereby preventing binding of the natural ligand.

A ligand delivered to magic roundabout might act as an angiogenesis inhibitor
useful in therapy of cancer or other diseases involving hyper-angiogenesis.
Also, introduction of the ECSM4 polypeptide to endothelial cells by gene
therapy using the ECSM4 encoding polynucleotide might alter growth and
migration.

A still further aspect of the invention provides a method of diagnosing a
condition which involves aberrant or excessive growth of vascular endothelium
in an individual comprising obtaining a sample containing nucleic acid from
the individual and contacting said sample with a polynucleotide which
selectively hybridises to a nucleic acid which encodes the ECSM4 polypeptide
or the ECSM1 polypeptide or a fragment or natural variant thereof.

The method may be used for aiding diagnosis.

A condition which involves aberrant or excessive growth of vascular
endothelium such as cancer, atherosclerosis, restenosis, diabetic retinopathy,
arthritis, psoriasis, endometriosis, menorrhagia, haemangiomas and venous
malformations may be caused by a mutation in the nucleic acid which encodes
the ECSM1 or ECSM4 polypeptides.

By "selectively hybridising" is meant that the nucleic acid has sufficient
nucleotide sequence similarity with the said human DNA or cDNA that it can
hybridise under moderately or highly stringent conditions. As is well known in
the art, the stringency of nucleic acid hybridization depends on factors such as
length of nucleic acid over which hybridisation occurs, degree of identity of the
hybridizing sequences and on factors such as temperature, ionic strength and
GC or AT content of the sequence. Thus, any nucleic acid which is capable of
selectively hybridising as said is useful in the practice of the invention.

Nucleic acids which can selectively hybridise to the said human DNA or 
cDNA include nucleic acids which have >95% sequence identity, preferably 
those with >98%, more preferably those with >99% sequence identity, over at 
least a portion of the nucleic acid with the said human DNA or cDNA. As is 
well known, human genes usually contain introns such that, for example, a 
mRNA or cDNA derived from a gene within the said human DNA would not 
match perfectly along its entire length with the said human DNA but would 
nevertheless be a nucleic acid capable of selectively hybridising to the said 
human DNA. Thus, the invention specifically includes nucleic acids which 
selectively hybridise to an ECSM4 or ECSM1 cDNA but may not hybridise to 
an ECSM4 or ECSM1 gene, or vice versa. For example, nucleic acids which 
span the intron-exon boundaries of the ECSM4 or ECSM1 gene may not be 
able to selectively hybridise to the ECSM4 or ECSM1 cDNA. 
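
The sequence identity thresholds given above (>95%, >98%, >99% over at least a portion of the nucleic acid) can be illustrated with a simple ungapped comparison. This is a sketch only: the function name and the example sequences are hypothetical, gaps are not considered, and percent identity is merely a proxy, since whether hybridisation actually occurs depends on the stringency conditions used.

```python
def best_ungapped_identity(probe, target):
    """Return the highest percent identity of `probe` against any
    same-length, ungapped window of `target` (same strand only)."""
    probe, target = probe.upper(), target.upper()
    n = len(probe)
    if n == 0 or n > len(target):
        raise ValueError("probe must be non-empty and no longer than target")
    best = 0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(a == b for a, b in zip(probe, window))
        best = max(best, matches)
    return 100.0 * best / n


if __name__ == "__main__":
    target = "ACGTACGTTAGCCGATAGGC"  # arbitrary example, not ECSM4
    print(best_ungapped_identity("TAGCCGATAG", target))
    print(best_ungapped_identity("TAGCCGTTAG", target))
```

A probe designed against cDNA that spans an exon-exon junction would score poorly against genomic DNA in such a comparison, which mirrors the point made above about introns.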

Typical moderately or highly stringent hybridisation conditions which lead to 
selective hybridisation are known in the art, for example those described in 
Molecular Cloning, a laboratory manual, 2nd edition, Sambrook et al (eds), 
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, USA, 
incorporated herein by reference. 

An example of a typical hybridisation solution when a nucleic acid is
immobilised on a nylon membrane and the probe nucleic acid is > 500 bases or
base pairs is:

6 x SSC (saline sodium citrate)
0.5% sodium dodecyl sulphate (SDS)
100 µg/ml denatured, fragmented salmon sperm DNA

The hybridisation is performed at 68°C. The nylon membrane, with the nucleic
acid immobilised, may be washed at 68°C in 1 x SSC or, for high stringency,
0.1 x SSC.

20 x SSC may be prepared in the following way. Dissolve 175.3 g of NaCl and
88.2 g of sodium citrate in 800 ml of H2O. Adjust the pH to 7.0 with a few
drops of a 10 N solution of NaOH. Adjust the volume to 1 litre with H2O.
Dispense into aliquots. Sterilize by autoclaving.
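
As a check on the recipe above, the molar concentrations of the 20 x SSC stock can be worked out from the stated masses. The sketch below assumes that "sodium citrate" means trisodium citrate dihydrate (molecular weight about 294.10 g/mol), which the text does not state; the function and constant names are illustrative.

```python
NACL_MW = 58.44               # g/mol
NA3_CITRATE_2H2O_MW = 294.10  # g/mol; assumes the dihydrate salt


def molarity(grams, molecular_weight, litres=1.0):
    """Molar concentration of `grams` of solute made up to `litres`."""
    return grams / molecular_weight / litres


if __name__ == "__main__":
    nacl = molarity(175.3, NACL_MW)
    citrate = molarity(88.2, NA3_CITRATE_2H2O_MW)
    # Under the stated assumption, 20 x SSC is about 3 M NaCl and
    # 0.3 M sodium citrate, so 1 x SSC is about 0.15 M and 0.015 M.
    print(round(nacl, 2), round(citrate, 2))
    print(round(nacl / 20, 3), round(citrate / 20, 3))
```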

An example of a typical hybridisation solution when a nucleic acid is
immobilised on a nylon membrane and the probe is an oligonucleotide of
between 15 and 50 bases is:

3.0 M tetramethylammonium chloride (TMACl)
0.01 M sodium phosphate (pH 6.8)
1 mM EDTA (pH 7.6)
0.5% SDS
100 µg/ml denatured, fragmented salmon sperm DNA
0.1% nonfat dried milk

The optimal temperature for hybridization is usually chosen to be 5°C below
the Ti for the given chain length. Ti is the irreversible melting temperature of
the hybrid formed between the probe and its target sequence. Jacobs et al
(1988) Nucl. Acids Res. 16, 4637 discusses the determination of Ti values. The
recommended hybridization temperature for 17-mers in 3 M TMACl is
48-50°C; for 19-mers, it is 55-57°C; and for 20-mers, it is 58-66°C.

By "nucleic acid which selectively hybridises" is also included nucleic acids
which will amplify DNA from the said region of human DNA by any of the
well known amplification systems such as those described in more detail
below, in particular the polymerase chain reaction (PCR). Suitable conditions
for PCR amplification include amplification in a suitable 1 x amplification
buffer:

10 x amplification buffer is 500 mM KCl; 100 mM Tris.Cl (pH 8.3 at room
temperature); 15 mM MgCl2; 0.1% gelatin.
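
For convenience, the working (1 x) concentrations implied by the 10 x buffer above can be derived by simple dilution. The following Python sketch is illustrative; the dictionary keys, function names and the 50 µl reaction volume in the example are assumptions, not values taken from the text.

```python
TEN_X_BUFFER = {
    "KCl (mM)": 500.0,
    "Tris.Cl (mM)": 100.0,
    "MgCl2 (mM)": 15.0,
    "gelatin (%)": 0.1,
}


def dilute(stock, fold):
    """Concentrations after diluting `stock` by the given fold."""
    return {name: conc / fold for name, conc in stock.items()}


def buffer_volume_ul(reaction_ul, fold=10):
    """Volume of concentrated buffer needed for a reaction of `reaction_ul`."""
    return reaction_ul / fold


if __name__ == "__main__":
    for name, conc in dilute(TEN_X_BUFFER, 10).items():
        print(name, conc)
    # Hypothetical 50 µl reaction: one tenth of the volume is 10 x buffer.
    print(buffer_volume_ul(50.0))
```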

A suitable denaturing agent or procedure (such as heating to 95°C) is used in
order to separate the strands of double-stranded DNA.

Suitably, the annealing part of the amplification is between 37°C and 60°C, 
preferably 50°C. 

Although the nucleic acid which is useful in the methods of the invention may
be RNA or DNA, DNA is preferred. Although the nucleic acid which is useful
in the methods of the invention may be double-stranded or single-stranded,
single-stranded nucleic acid is preferred under some circumstances such as in
nucleic acid amplification reactions.

The sample may be directly derived from the patient, for example, by biopsy of
a tissue which may be associated with aberrant vascular development, or it may
be derived from the patient from a site remote from the tissue, for example
because cells from the tissue have migrated from the tissue to other parts of the
body. Alternatively, the sample may be indirectly derived from the patient in
the sense that, for example, the tissue or cells therefrom may be cultivated in
vitro, or cultivated in a xenograft model; or the nucleic acid sample may be one
which has been replicated (whether in vitro or in vivo) from nucleic acid from
the original source from the patient. Thus, although the nucleic acid derived
from the patient may have been physically within the patient, it may
alternatively have been copied from nucleic acid which was physically within
the patient. When aberrant vascular development is believed to be associated
with a tumour, tumour tissue may be taken from the primary tumour or from
metastases.

It will be appreciated that a useful method of the invention includes the
analysis of mutations in, or the detection of the presence or absence of, the
ECSM4 or ECSM1 gene in any suitable sample. The sample may suitably be a
freshly-obtained sample from the patient, or the sample may be an historic
sample, for example a sample held in a library of samples.

Conveniently, the nucleic acid capable of selectively hybridising to the said 
human DNA and which is used in the methods of the invention further 
comprises a detectable label. 

By "detectable label" is included any convenient radioactive label such as 32P,
33P or 35S which can readily be incorporated into a nucleic acid molecule using
well known methods; any convenient fluorescent or chemiluminescent label
which can readily be incorporated into a nucleic acid is also included. In
addition the term "detectable label" also includes a moiety which can be
detected by virtue of binding to another moiety (such as biotin which can be
detected by binding to streptavidin); and a moiety, such as an enzyme, which
can be detected by virtue of its ability to convert a colourless compound into a
coloured compound, or vice versa (for example, alkaline phosphatase can
convert colourless o-nitrophenylphosphate into coloured o-nitrophenol).
Conveniently, the nucleic acid probe may occupy a certain position in a fixed
assay and whether the nucleic acid hybridises to the said region of human DNA
can be determined by reference to the position of hybridisation in the fixed
assay. The detectable label may also be a fluorophore-quencher pair as
described in Tyagi & Kramer (1996) Nature Biotechnology 14, 303-308.

Conveniently, in this method of diagnosis of a condition in which vascular
development is aberrant the nucleic acid which is capable of the said selective
hybridisation (whether labelled with a detectable label or not) is contacted with
a nucleic acid derived from the patient under hybridising conditions. Suitable
hybridising conditions include those described above.

This method of diagnosing a condition in which vascular development is
aberrant may involve sequencing of DNA at one or more of the relevant
positions within the relevant region, including direct sequencing; direct
sequencing of PCR-amplified exons; differential hybridisation of an
oligonucleotide probe designed to hybridise at the relevant positions within the
relevant region (conveniently this uses immobilised oligonucleotide probes in
so-called "chip" systems which are well known in the art); denaturing gel
electrophoresis following digestion with an appropriate restriction enzyme,
preferably following amplification of the relevant DNA regions; S1 nuclease
sequence analysis; non-denaturing gel electrophoresis, preferably following
amplification of the relevant DNA regions; conventional RFLP (restriction
fragment length polymorphism) assays; heteroduplex analysis; selective DNA
amplification using oligonucleotides; fluorescent in situ hybridisation (FISH)
of interphase chromosomes; ARMS-PCR (Amplification Refractory Mutation
System-PCR) for specific mutations; cleavage at mismatch sites in hybridised
nucleic acids (the cleavage being chemical or enzymic); SSCP (single strand
conformational polymorphism) or DGGE (discontinuous or denaturing gradient
gel electrophoresis); analysis to detect mismatch in annealed normal/mutant
PCR-amplified DNA; and protein truncation assay (transcription and
translation of exons - if a mutation introduces a stop codon a truncated
protein product will result). Other methods may be employed such as detecting
changes in the secondary structure of single-stranded DNA resulting from
changes in the primary sequence, for example, using the cleavase I enzyme.
This system is commercially available from GibcoBRL, Life Technologies, 3
Fountain Drive, Inchinnan Business Park, Paisley PA4 9RF, Scotland.
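
The principle behind the protein truncation assay mentioned above, namely that a mutation introducing a stop codon yields a shortened product, can be illustrated in silico. The sketch below is an assumption-laden toy: it reads a coding sequence in frame from its first base, uses the standard stop codons, and the example sequences are invented rather than taken from the ECSM4 or ECSM1 gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(coding_dna):
    """Return the 0-based codon index of the first in-frame stop codon,
    reading from the first base, or None if there is none."""
    seq = coding_dna.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def is_truncating(wild_type, mutant):
    """True if the mutant reaches a stop codon earlier than the wild type."""
    wt_stop = first_stop_codon(wild_type)
    mut_stop = first_stop_codon(mutant)
    if mut_stop is None:
        return False
    return wt_stop is None or mut_stop < wt_stop


if __name__ == "__main__":
    wild_type = "ATGGCATGGAAATAA"  # invented: ATG GCA TGG AAA TAA
    mutant = "ATGGCATGAAAATAA"     # TGG -> TGA introduces an early stop
    print(first_stop_codon(wild_type), first_stop_codon(mutant))
    print(is_truncating(wild_type, mutant))
```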

It will be appreciated that the methods of the invention may also be carried out
on "DNA chips". Such "chips" are described in US 5,445,934 (Affymetrix;
probe arrays), WO 96/31622 (Oxford; probe array plus ligase or polymerase
extension), and WO 95/22058 (Affymax; fluorescently marked targets bind to
oligomer substrate, and location in array detected); all of these are incorporated
herein by reference.

Detailed methods of mutation detection are described in "Laboratory Protocols
for Mutation Detection" 1996, ed. Landegren, Oxford University Press on
behalf of HUGO (Human Genome Organisation).

It is preferred if RFLP is used for the detection of fairly large (> 500 bp)
deletions or insertions. Southern blots may be used for this method of the
invention.

PCR amplification of smaller regions (maximum 300 bp) may be preferred for
detecting small changes, such as insertions or deletions of 3-4 bp or more.
Amplified sequence may be analysed on a sequencing gel, and small changes
(minimum size 3-4 bp) can be visualised. Suitable primers are designed as
herein described.

In addition, restriction enzyme variant sites may be detected using either
Southern blot analysis or PCR. For example, variant sites in genomic DNA may
be analysed by restriction enzyme digestion, gel electrophoresis, Southern
blotting, and hybridisation with a specific probe (for example any suitable
fragment derived from the ECSM4 or ECSM1 cDNA or gene).

As a further example, variant sites may be analysed using PCR by DNA
amplification, restriction enzyme digestion, and gel detection by ethidium
bromide, silver staining or incorporation of radionucleotide or fluorescent
primer in the PCR.

Other suitable methods include the development of allele specific
oligonucleotides (ASOs) for specific mutational events. Similar methods are
used on RNA and cDNA from a suitable tissue.

Whilst it is useful to detect mutations in any part of the ECSM4 or ECSM1
gene, it is preferred if the mutations are detected in the exons of the gene and it
is further preferred if the mutations are ones which change the coding sense.
The detection of these mutations is a preferred aspect of the invention.

The methods of the invention also include checking for loss-of-heterozygosity
(LOH; shows one copy lost). LOH may be a sufficient marker for diagnosis;
looking for mutation/loss of the second allele may not be necessary. LOH of
the gene may be detected using polymorphisms in the coding sequence, and
introns, of the gene.

Particularly preferred nucleic acids for use in the aforementioned methods of
the invention are those selected from the group consisting of primers suitable
for amplifying nucleic acid.

Suitably, the primers are selected from the group consisting of primers which
hybridise to the nucleotide sequences shown in any of the Figures which show
ECSM4 or ECSM1 gene or cDNA sequences. It is particularly preferred if the
primers hybridise to the introns of the ECSM4 or ECSM1 gene or if the
primers are ones which will prime synthesis of DNA from the ECSM4 or
ECSM1 gene or cDNA but not from other genes or cDNAs.

Primers which are suitable for use in a polymerase chain reaction (PCR; Saiki
et al (1988) Science 239, 487-491) are preferred. Suitable PCR primers and
methods of detecting products of PCR reactions are described in detail above.

Any of the nucleic acid amplification protocols can be used in the method of
the invention including the polymerase chain reaction, Qβ replicase and ligase
chain reaction. Also, NASBA (nucleic acid sequence based amplification),
also called 3SR, can be used as described in Compton (1991) Nature 350,
91-92 and AIDS (1993), Vol 7 (Suppl 2), S108 or SDA (strand displacement
amplification) can be used as described in Walker et al (1992) Nucl. Acids Res.
20, 1691-1696. The polymerase chain reaction is particularly preferred
because of its simplicity.

The present invention provides the use of a nucleic acid which selectively
hybridises to the human-derived DNA of genomic clones as described in Table
8 of Example 1 or to the ECSM4 or ECSM1 gene, or a mutant allele thereof, or
a nucleic acid which selectively hybridises to ECSM4 or ECSM1 cDNA or a
mutant allele thereof, or their complement in a method of diagnosing a
condition in which vascular development is aberrant; or in the manufacture of a
reagent for carrying out these methods.

Preferred polynucleotides which selectively hybridise to the ECSM4 gene or 
cDNA are as described above in relation to a method of diagnosis. 

Also, the present invention provides a method of determining the presence or
absence of, or a mutation in, the said ECSM4 or ECSM1 gene. Preferably, the
method uses a suitable sample from a patient.

The methods of the invention include the detection of mutations in the ECSM4
or ECSM1 gene.

The methods of the invention may make use of a difference in restriction
enzyme cleavage sites caused by mutation. A non-denaturing gel may be used
to detect differing lengths of fragments resulting from digestion with an
appropriate restriction enzyme.

An "appropriate restriction enzyme" is one which will recognise and cut the
wild-type sequence and not the mutated sequence or vice versa. The sequence
which is recognised and cut by the restriction enzyme (or not, as the case may
be) can be present as a consequence of the mutation or it can be introduced into
the normal or mutant allele using mismatched oligonucleotides in the PCR
reaction. It is convenient if the enzyme cuts DNA only infrequently, in other
words if it recognises a sequence which occurs only rarely.
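
Whether a given enzyme is "appropriate" in the sense defined above can be checked by looking for its recognition sequence in each allele. The Python sketch below is illustrative only: the function names and allele sequences are invented, and it searches a single strand, which is sufficient only for palindromic recognition sites such as GAATTC (EcoRI).

```python
def site_count(sequence, site):
    """Count occurrences of a recognition site on the strand given
    (overlapping occurrences are counted)."""
    sequence, site = sequence.upper(), site.upper()
    return sum(
        sequence.startswith(site, i)
        for i in range(len(sequence) - len(site) + 1)
    )


def discriminates(site, wild_type, mutant):
    """True if the site is present in exactly one of the two alleles,
    so that digestion distinguishes wild type from mutant."""
    return (site_count(wild_type, site) > 0) != (site_count(mutant, site) > 0)


if __name__ == "__main__":
    wild_type = "ACCGGAATTCTTAG"  # invented allele containing GAATTC
    mutant = "ACCGGAATACTTAG"     # point change destroys the site
    print(discriminates("GAATTC", wild_type, mutant))
    print(discriminates("GGATCC", wild_type, mutant))
```

A low `site_count` across the whole amplified region corresponds to the preference stated above for an enzyme that cuts only infrequently.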

In another method, a pair of PCR primers is used which matches (ie hybridises
to) either the wild-type genotype or the mutant genotype but not both. Whether
amplified DNA is produced will then indicate the wild-type or mutant
genotype (and hence phenotype). However, this method relies partly on a
negative result (ie the absence of amplified DNA) which could be due to a
technical failure. It therefore may be less reliable and/or require additional
control experiments.

A preferable method employs similar PCR primers but, as well as hybridising 
to only one of the wild-type or mutant sequences, they introduce a restriction 
site which is not otherwise there in either the wild-type or mutant sequences. 

The nucleic acids which selectively hybridise to the ECSM4 or ECSM1 gene
or cDNA, or which selectively hybridise to the genomic clones containing
ECSM4 or ECSM1 as listed in Table 8 of Example 1 are useful for a number of
purposes. They can be used in Southern hybridization to genomic DNA and in
the RNase protection method for detecting point mutations already discussed
above. The probes can be used to detect PCR amplification products. They
may also be used to detect mismatches with the ECSM4 or ECSM1 gene or
mRNA in a sample using other techniques. Mismatches can be detected using
either enzymes (eg S1 nuclease or resolvase), chemicals (eg hydroxylamine or
osmium tetroxide and piperidine), or changes in electrophoretic mobility of
mismatched hybrids as compared to totally matched hybrids. These techniques
are known in the art. Generally, the probes are complementary to the ECSM4
or ECSM1 gene coding sequences, although probes to certain introns are also
contemplated. A battery of nucleic acid probes may be used to compose a kit
for detecting loss of or mutation in the wild-type ECSM4 or ECSM1 gene.
The kit allows for hybridization to the entire ECSM4 or ECSM1 gene. The
probes may overlap with each other or be contiguous.

If a riboprobe is used to detect mismatches with mRNA, it is complementary to
the mRNA of the human ECSM4 or ECSM1 gene. The riboprobe thus is an
anti-sense probe in that it does not code for the protein encoded by the ECSM4
or ECSM1 gene because it is of the opposite polarity to the sense strand. The
riboprobe generally will be labelled, for example, radioactively labelled which
can be accomplished by any means known in the art. If the riboprobe is used
to detect mismatches with DNA it can be of either polarity, sense or anti-sense.
Similarly, DNA probes also may be used to detect mismatches.

Nucleic acid probes may also be complementary to mutant alleles of the
ECSM4 or ECSM1 gene. These are useful to detect similar mutations in other
patients on the basis of hybridization rather than mismatches. As mentioned
above, the ECSM4 or ECSM1 gene probes can also be used in Southern
hybridizations to genomic DNA to detect gross chromosomal changes such as
deletions and insertions.

Particularly useful methods of detecting a mutation in the ECSM1 or ECSM4
genes include single strand conformation polymorphism (SSCP), heteroduplex
analysis, polymerase chain reaction, using DNA chips and sequencing.

Any sample containing nucleic acid derived from the individual is useful in the
methods of the invention. It is preferred if the nucleic acid in the sample is
DNA. Thus, samples from cells may be obtained as is well known in the art,
for example from blood samples or cheek cells or the like. Where the methods
are being used to determine the presence or absence of a mutation in an unborn
child, it is preferred if the sample is a maternal sample containing nucleic acid
from the unborn child. Suitable maternal samples include the amniotic fluid of
the mother, chorionic villus samples and blood samples from which foetal cells
can be isolated.

A further aspect of the invention provides a method of reducing the expression
of the ECSM4 or ECSM1 polynucleotide in an individual, comprising
administering to the individual an agent which selectively prevents expression
of ECSM4 or ECSM1.

In a preferred embodiment, the agent which selectively prevents expression of
ECSM4 or ECSM1 is an antisense nucleic acid.

DNA-DNA, or RNA-DNA duplex is formed. These nucleic acids are often
termed "antisense" because they are complementary to the sense or coding
strand of the gene. Recently, formation of a triple helix has proven possible
where the oligonucleotide is bound to a DNA duplex. It was found that
oligonucleotides could recognise sequences in the major groove of the DNA
double helix. A triple helix was formed thereby. This suggests that it is
possible to synthesise sequence-specific molecules which specifically bind
double-stranded DNA via recognition of major groove hydrogen bonding sites.

By binding to the target nucleic acid, the above oligonucleotides can inhibit the
function of the target nucleic acid. This could, for example, be a result of
blocking the transcription, processing, poly(A) addition, replication, translation,
or promoting inhibitory mechanisms of the cells, such as promoting RNA
degradation.

Antisense oligonucleotides are prepared in the laboratory and then introduced
into cells, for example by microinjection or uptake from the cell culture
medium into the cells, or they are expressed in cells after transfection with
plasmids or retroviruses or other vectors carrying an antisense gene. Antisense
oligonucleotides were first discovered to inhibit viral replication or expression
in cell culture for Rous sarcoma virus, vesicular stomatitis virus, herpes
simplex virus type 1, simian virus and influenza virus. Since then, inhibition of
mRNA translation by antisense oligonucleotides has been studied extensively
in cell-free systems including rabbit reticulocyte lysates and wheat germ
extracts. Inhibition of viral function by antisense oligonucleotides has been
demonstrated in vitro using oligonucleotides which were complementary to the
AIDS HIV retrovirus RNA (Goodchild, J. 1988 "Inhibition of Human
Immunodeficiency Virus Replication by Antisense Oligodeoxynucleotides"
Proc. Natl Acad. Sci. USA 85(15), 5507-11). The Goodchild study showed
that oligonucleotides that were most effective were complementary to the
poly(A) signal; also effective were those targeted at the 5' end of the RNA,
particularly the cap and 5' untranslated region, next to the primer binding site
and at the primer binding site. The cap, 5' untranslated region, and poly(A)
signal lie within the sequence repeated at the ends of retrovirus RNA (R
region) and the oligonucleotides complementary to these may bind twice to the
RNA.

Typically, antisense oligonucleotides are 15 to 35 bases in length. For example,
20-mer oligonucleotides have been shown to inhibit the expression of the
epidermal growth factor receptor mRNA (Witters et al, Breast Cancer Res
Treat 53:41-50 (1999)) and 25-mer oligonucleotides have been shown to
decrease the expression of adrenocorticotropic hormone by greater than 90%
(Frankel et al, J Neurosurg 91:261-7 (1999)). However, it is appreciated that it
may be desirable to use oligonucleotides with lengths outside this range, for
example 10, 11, 12, 13, or 14 bases, or 36, 37, 38, 39 or 40 bases.
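
To illustrate what "antisense" means at the sequence level, the following Python sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA. It is a minimal sketch under stated assumptions: the function name and the example mRNA are hypothetical (the sequence is not ECSM4 or ECSM1), the default length of 20 is simply one value within the typical 15 to 35 base range given above, and no attempt is made to choose an effective target site.

```python
# Maps each RNA base to the complementary DNA base.
DNA_COMPLEMENT_OF_RNA = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide, written 5'->3', that is complementary
    to mrna[start:start + length] (mRNA given 5'->3')."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be positive")
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("target window extends beyond the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(DNA_COMPLEMENT_OF_RNA)[::-1]


if __name__ == "__main__":
    mrna = "AUGGCCAAGUUCGGAUCCGAUUAA"  # invented example sequence
    print(antisense_oligo(mrna, 0, 10))
```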

Oligonucleotides are subject to being degraded or inactivated by cellular
endogenous nucleases. To counter this problem, it is possible to use modified
oligonucleotides, eg having altered internucleotide linkages, in which the
naturally occurring phosphodiester linkages have been replaced with another
linkage. For example, Agrawal et al (1988) Proc. Natl Acad. Sci. USA 85,
7079-7083 showed increased inhibition in tissue culture of HIV-1 using
oligonucleotide phosphoramidates and phosphorothioates. Sarin et al (1988)
Proc. Natl Acad. Sci. USA 85, 7448-7451 demonstrated increased inhibition of
HIV-1 using oligonucleotide methylphosphonates. Agrawal et al (1989) Proc.
Natl Acad. Sci. USA 86, 7790-7794 showed inhibition of HIV-1 replication in
both early-infected and chronically infected cell cultures, using nucleotide
sequence-specific oligonucleotide phosphorothioates. Leiter et al (1990) Proc.
Natl Acad. Sci. USA 87, 3430-3434 report inhibition in tissue culture of influenza
virus replication by oligonucleotide phosphorothioates.

Oligonucleotides having artificial linkages have been shown to be resistant to 
degradation in vivo. For example, Shaw et al (1991) in Nucleic Acids Res. 19, 
747-750, report that otherwise unmodified oligonucleotides become more 
resistant to nucleases in vivo when they are blocked at the 3' end by certain 
capping structures and that uncapped oligonucleotide phosphorothioates are not 
degraded in vivo. 

A detailed description of the H-phosphonate approach to synthesizing 
oligonucleoside phosphorothioates is provided in Agrawal and Tang (1990) 
Tetrahedron Letters 31, 7541-7544, the teachings of which are hereby 
incorporated herein by reference. Syntheses of oligonucleoside 
methylphosphonates, phosphorodithioates, phosphoramidates, phosphate esters, 
bridged phosphoramidates and bridged phosphorothioates are known in the art. 
See, for example, Agrawal and Goodchild (1987) Tetrahedron Letters 28, 3539; 
Nielsen et al (1988) Tetrahedron Letters 29, 2911; Jager et al (1988) 
Biochemistry 27, 7237; Uznanski et al (1987) Tetrahedron Letters 28, 3401; 
Bannwarth (1988) Helv. Chim. Acta. 71, 1517; Cosstick and Vyle (1989) 
Tetrahedron Letters 30, 4693; Agrawal et al (1990) Proc. Natl. Acad. Sci. USA 
87, 1401-1405, the teachings of which are incorporated herein by reference. 
Other methods for synthesis or production also are possible. In a preferred 
embodiment the oligonucleotide is a deoxyribonucleic acid (DNA), although 
ribonucleic acid (RNA) sequences may also be synthesized and applied. 

The oligonucleotides useful in the invention preferably are designed to resist 
degradation by endogenous nucleolytic enzymes. In vivo degradation of 
oligonucleotides produces oligonucleotide breakdown products of reduced 
length. Such breakdown products are more likely to engage in non-specific 
hybridization and are less likely to be effective, relative to their full-length 
counterparts. Thus, it is desirable to use oligonucleotides that are resistant to 
degradation in the body and which are able to reach the targeted cells. The 
present oligonucleotides can be rendered more resistant to degradation in vivo by 
substituting one or more internal artificial internucleotide linkages for the native 
phosphodiester linkages, for example, by replacing phosphate with sulphur in the 
linkage. Examples of linkages that may be used include phosphorothioates, 
methylphosphonates, sulphone, sulphate, ketyl, phosphorodithioates, various 
phosphoramidates, phosphate esters, bridged phosphorothioates and bridged 
phosphoramidates. Such examples are illustrative, rather than limiting, since 
other internucleotide linkages are known in the art. See, for example, Cohen, 
(1990) Trends in Biotechnology. The synthesis of oligonucleotides having one or 
more of these linkages substituted for the phosphodiester internucleotide linkages 
is well known in the art, including synthetic pathways for producing 
oligonucleotides having mixed internucleotide linkages. 

Oligonucleotides can be made resistant to extension by endogenous enzymes by 
"capping" or incorporating similar groups on the 5' or 3' terminal nucleotides. A 
reagent for capping is commercially available as Amino-Link II™ from Applied 
BioSystems Inc, Foster City, CA. Methods for capping are described, for 
example, by Shaw et al (1991) Nucleic Acids Res. 19, 747-750 and Agrawal et al 
(1991) Proc. Natl. Acad. Sci. USA 88(17), 7595-7599, the teachings of which are 
hereby incorporated herein by reference. 

A further method of making oligonucleotides resistant to nuclease attack is for 
them to be "self-stabilized" as described by Tang et al (1993) Nucl. Acids Res. 21, 
2729-2735 incorporated herein by reference. Self-stabilized oligonucleotides 
have hairpin loop structures at their 3' ends, and show increased resistance to 
degradation by snake venom phosphodiesterase, DNA polymerase I and fetal 
bovine serum. The self-stabilized region of the oligonucleotide does not interfere 
in hybridization with complementary nucleic acids, and pharmacokinetic and 
stability studies in mice have shown increased in vivo persistence of 
self-stabilized oligonucleotides with respect to their linear counterparts. 

In accordance with the invention, the antisense compound may be administered 
systemically. Alternatively the inherent binding specificity of antisense 
oligonucleotides characteristic of base pairing is enhanced by limiting the 
availability of the antisense compound to its intended locus in vivo, permitting 
lower dosages to be used and minimising systemic effects. Thus, 
oligonucleotides may be applied locally to achieve the desired effect. The 
concentration of the oligonucleotides at the desired locus is much higher than if 
the oligonucleotides were administered systemically, and the therapeutic effect 
can be achieved using a significantly lower total amount. The local high 
concentration of oligonucleotides enhances penetration of the targeted cells and 
effectively blocks translation of the target nucleic acid sequences. 

The oligonucleotides can be delivered to the locus by any means appropriate for 
localised administration of a drug. For example, a solution of the 
oligonucleotides can be injected directly to the site or can be delivered by 
infusion using an infusion pump. The oligonucleotides also can be incorporated 
into an implantable device which when placed at the desired site, permits the 
oligonucleotides to be released into the surrounding locus. 

The oligonucleotides may be administered via a hydrogel material. The hydrogel 
is non-inflammatory and biodegradable. Many such materials now are known, 
including those made from natural and synthetic polymers. In a preferred 
embodiment, the method exploits a hydrogel which is liquid below body 
temperature but gels to form a shape-retaining semisolid hydrogel at or near body 
temperature. Preferred hydrogels are polymers of ethylene oxide-propylene oxide 
repeating units. The properties of the polymer are dependent on the molecular 
weight of the polymer and the relative percentage of polyethylene oxide and 
polypropylene oxide in the polymer. Preferred hydrogels contain from about 
10% to about 80% by weight ethylene oxide and from about 20% to about 90% 
by weight propylene oxide. A particularly preferred hydrogel contains about 
70% polyethylene oxide and 30% polypropylene oxide. Hydrogels which can be 
used are available, for example, from BASF Corp., Parsippany, NJ, under the 
tradename Pluronic®. 

In this embodiment, the hydrogel is cooled to a liquid state and the 
oligonucleotides are admixed into the liquid to a concentration of about 1 mg 
oligonucleotide per gram of hydrogel. The resulting mixture then is applied 
onto the surface to be treated, for example by spraying or painting during 
surgery or using a catheter or endoscopic procedures. As the polymer warms, 
it solidifies to form a gel, and the oligonucleotides diffuse out of the gel into 
the surrounding cells over a period of time defined by the exact composition of 
the gel. 

It will be appreciated that the oligonucleotides or other agents may be 
administered after surgical removal of a tumour, and may be administered to 
the area from which the tumour has been removed, and surrounding tissue, for 
example using cystoscopy to guide application of the oligonucleotides or other 
agents. 

The oligonucleotides can be administered by means of other implants that are 
commercially available or described in the scientific literature, including 
liposomes, microcapsules and implantable devices. For example, implants made 
of biodegradable materials such as polyanhydrides, polyorthoesters, polylactic 
acid and polyglycolic acid and copolymers thereof, collagen, and protein 
polymers, or non-biodegradable materials such as ethylenevinyl acetate (EVAc), 
polyvinyl acetate, ethylene vinyl alcohol, and derivatives thereof can be used to 
locally deliver the oligonucleotides. The oligonucleotides can be incorporated 
into the material as it is polymerised or solidified, using melt or solvent 
evaporation techniques, or mechanically mixed with the material. In one 
embodiment, the oligonucleotides are mixed into or applied onto coatings for 
implantable devices such as dextran coated silica beads, stents, or catheters. 

The dose of oligonucleotides is dependent on the size of the oligonucleotides and 
the purpose for which they are administered. In general, the range is calculated based 
on the surface area of tissue to be treated. The effective dose of oligonucleotide 
is somewhat dependent on the length and chemical composition of the 
oligonucleotide but is generally in the range of about 30 to 3000 µg per square 
centimetre of tissue surface area. 
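
Purely as a worked illustration of the arithmetic, the local dose range for a given tissue surface area, and the mass of hydrogel needed to carry a chosen dose at the loading of about 1 mg oligonucleotide per gram of hydrogel mentioned above, may be sketched as follows. The function names and the 4 square centimetre example area are assumptions made for the example and are not dosing guidance.

```python
# Illustrative arithmetic only; function names and the example area are hypothetical.

def local_dose_range_ug(area_cm2: float,
                        low_ug_per_cm2: float = 30.0,
                        high_ug_per_cm2: float = 3000.0) -> tuple[float, float]:
    """Return the (lower, upper) dose in micrograms for a tissue surface area in cm^2."""
    if area_cm2 <= 0:
        raise ValueError("surface area must be positive")
    return (area_cm2 * low_ug_per_cm2, area_cm2 * high_ug_per_cm2)


def hydrogel_grams_for_dose(dose_ug: float, loading_mg_per_g: float = 1.0) -> float:
    """Return grams of hydrogel needed to carry dose_ug at the given loading (mg per g)."""
    if loading_mg_per_g <= 0:
        raise ValueError("loading must be positive")
    return (dose_ug / 1000.0) / loading_mg_per_g  # convert micrograms to milligrams first


low, high = local_dose_range_ug(4.0)          # hypothetical 4 cm^2 treatment area
print(low, high)                              # dose bounds in micrograms
print(hydrogel_grams_for_dose(low), hydrogel_grams_for_dose(high))
```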

The oligonucleotides may be administered to the patient systemically for both 
therapeutic and prophylactic purposes. The oligonucleotides may be 
administered by any effective method, for example, parenterally (eg 
intravenously, subcutaneously, intramuscularly) or by oral, nasal or other means 
which permit the oligonucleotides to access and circulate in the patient's 
bloodstream. Oligonucleotides administered systemically preferably are given in 
addition to locally administered oligonucleotides, but also have utility in the 
absence of local administration. A dosage in the range of from about 0.1 to about 
10 grams per administration to an adult human generally will be effective for this 
purpose. 

It will be appreciated that antisense agents also include larger molecules which 
bind to said ECSM4 or ECSM1 mRNA or genes and substantially prevent 
expression of said ECSM4 or ECSM1 mRNA or genes and of said ECSM4 or 
ECSM1 protein. Thus, expression of an antisense 
molecule which is substantially complementary to said ECSM4 or ECSM1 
mRNA is envisaged as part of the invention. 

The said larger molecules may be expressed from any suitable genetic construct 
as is described below and delivered to the patient. Typically, the genetic 
construct which expresses the antisense molecule comprises at least a portion of 
the said ECSM4 or ECSM1 cDNA or gene operatively linked to a promoter 
which can express the antisense molecule in a cell. Promoters that may be active 
in endothelial cells are described below. 

Although the genetic construct can be DNA or RNA it is preferred if it is DNA. 
Preferably, the genetic construct is adapted for delivery to a human cell. 

Means and methods of introducing a genetic construct into a cell in an animal 
body are known in the art. For example, the constructs of the invention may be 
introduced into proliferating endothelial cells by any convenient method, for 
example methods involving retroviruses, so that the construct is inserted into 
the genome of the endothelial cell. For example, in Kuriyama et al (1991) Cell 
Struc. and Func. 16, 503-510 purified retroviruses are administered. 
Retroviruses provide a potential means of selectively infecting proliferating 
endothelial cells because they can only integrate into the genome of dividing 
cells; most endothelial cells are in a quiescent, non-receptive stage of cell 
growth or, at least, are dividing much less rapidly than angiogenic cells. 
Retroviral DNA constructs which encode said antisense agents may be made 
using methods well known in the art. To produce active retrovirus from such a 
construct it is usual to use an ecotropic psi2 packaging cell line grown in 
Dulbecco's modified Eagle's medium (DMEM) containing 10% foetal calf 
serum (FCS). Transfection of the cell line is conveniently by calcium 
phosphate co-precipitation, and stable transformants are selected by addition of 
G418 to a final concentration of 1 mg/ml (assuming the retroviral construct 
contains a neoR gene). Independent colonies are isolated and expanded and the 
culture supernatant removed, filtered through a 0.45 µm pore-size filter and 
stored at -70°C. For the introduction of the retrovirus into the tumour cells, it is 
convenient to inject directly retroviral supernatant to which 10 µg/ml 
Polybrene has been added. For tumours exceeding 10 mm in diameter it is 
appropriate to inject between 0.1 ml and 1 ml of retroviral supernatant; 
preferably 0.5 ml. 

Alternatively, as described in Culver et al (1992) Science 256, 1550-1552, cells 
which produce retroviruses are injected into specific tissue. The 
retrovirus-producing cells so introduced are engineered to actively produce retroviral 
vector particles so that continuous production of the vector occurs within 
the tumour mass in situ. Thus, proliferating endothelial cells can be 
successfully transduced in vivo if mixed with retroviral vector-producing cells. 

Targeted retroviruses are also available for use in the invention; for example, 
sequences conferring specific binding affinities may be engineered into pre- 
existing viral env genes (see Miller & Vile (1995) Faseb J. 9, 190-199 for a 
review of this and other targeted vectors for gene therapy). 

Other methods involve simple delivery of the construct into the cell for 
expression therein either for a limited time or, following integration into the 
genome, for a longer time. An example of the latter approach includes 
(preferably endothelial-cell-targeted) liposomes (Nassander et al (1992) 
Cancer Res. 52, 646-653). 

Immunoliposomes (antibody-directed liposomes) are especially useful in 
targeting to endothelial cell types which express a cell surface protein for 
which antibodies are available. 

Other methods of delivery include adenoviruses carrying external DNA via an 
antibody-polylysine bridge (see Curiel, Prog. Med. Virol. 40, 1-18) and 
transferrin-polycation conjugates as carriers (Wagner et al (1990) Proc. Natl. 
Acad. Sci. USA 87, 3410-3414). In the first of these methods a 
polycation-antibody complex is formed with the DNA construct or other genetic construct 
of the invention, wherein the antibody is specific for either wild-type 
adenovirus or a variant adenovirus in which a new epitope has been introduced 
which binds the antibody. The polycation moiety binds the DNA via 
electrostatic interactions with the phosphate backbone. The adenovirus, 
because it contains unaltered fibre and penton proteins, is internalised into the 
cell and carries into the cell with it the DNA construct of the invention. It is 
preferred if the polycation is polylysine. 

The DNA may also be delivered by adenovirus wherein it is present within the 
adenovirus particle, for example, as described below. 

In the second of these methods, a high-efficiency nucleic acid delivery system 
that uses receptor-mediated endocytosis to carry DNA macromolecules into 
cells is employed. This is accomplished by conjugating the iron-transport 
protein transferrin to polycations that bind nucleic acids. Human transferrin, or 
the chicken homologue conalbumin, or combinations thereof is covalently 
linked to the small DNA-binding protein protamine or to polylysines of various 
sizes through a disulfide linkage. These modified transferrin molecules 
maintain their ability to bind their cognate receptor and to mediate efficient 
iron transport into the cell. The transferrin-polycation molecules form 
electrophoretically stable complexes with DNA constructs or other genetic 
constructs of the invention independent of nucleic acid size (from short 
oligonucleotides to DNA of 21 kilobase pairs). When complexes of 
transferrin-polycation and the DNA constructs or other genetic constructs of 
the invention are supplied to the endothelial cells, a high level of expression 
from the construct in the cells is expected. 

High-efficiency receptor-mediated delivery of the DNA constructs or other 
genetic constructs of the invention using the endosome-disruption activity of 
defective or chemically inactivated adenovirus particles produced by the 
methods of Cotten et al (1992) Proc. Natl. Acad. Sci. USA 89, 6094-6098 may 
also be used. This approach appears to rely on the fact that adenoviruses are 
adapted to allow release of their DNA from an endosome without passage 
through the lysosome, and in the presence of, for example, transferrin linked to 
the DNA construct or other genetic construct of the invention, the construct is 
taken up by the cell by the same route as the adenovirus particle. 

This approach has the advantages that there is no need to use complex 
retroviral constructs; there is no permanent modification of the genome as 
occurs with retroviral infection; and the targeted expression system is coupled 
with a targeted delivery system, thus reducing toxicity to other cell types. 

It may be desirable to locally perfuse a tumour with the suitable delivery 
vehicle comprising the genetic construct for a period of time; additionally or 
alternatively the delivery vehicle or genetic construct can be injected directly 
into accessible tumours. 

It will be appreciated that "naked DNA" and DNA complexed with cationic 
and neutral lipids may also be useful in introducing the DNA of the invention 
into cells of the patient to be treated. Non-viral approaches to gene therapy are 
described in Ledley (1995) Human Gene Therapy 6, 1129-1144. 

adenovirus system described in WO 94/10323 wherein, typically, the DNA is 
carried within the adenovirus, or adenovirus-like, particle. Michael et al 
(1995) Gene Therapy 2, 660-668 describes modification of adenovirus to add a 
cell-selective moiety into a fibre protein. Mutant adenoviruses which replicate 
selectively in p53-deficient human tumour cells, such as those described in 
Bischoff et al (1996) Science 274, 373-376 are also useful for delivering the 
genetic construct of the invention to a cell. Thus, it will be appreciated that a 
further aspect of the invention provides a virus or virus-like particle comprising 
a genetic construct of the invention. Other suitable viruses or virus-like 
particles include HSV, AAV, vaccinia and parvovirus. 

In a further embodiment the agent which selectively prevents the function of 
ECSM4 or ECSM1 is a ribozyme capable of cleaving targeted ECSM4 or 
ECSM1 RNA or DNA. A gene expressing said ribozyme may be administered 
in substantially the same way and using substantially the same vehicles as for the 
antisense molecules. 

Ribozymes which may be encoded in the genomes of the viruses or virus-like 
particles herein disclosed are described in Cech and Herschlag "Site-specific 
cleavage of single stranded DNA" US 5,180,818; Altman et al "Cleavage of 
targeted RNA by RNAse P" US 5,168,053, Cantin et al "Ribozyme cleavage of 
HIV-1 RNA" US 5,149,796; Cech et al "RNA ribozyme restriction 
endoribonucleases and methods", US 5,116,742; Been et al "RNA ribozyme 
polymerases, dephosphorylases, restriction endonucleases and methods", US 
5,093,246; and Been et al "RNA ribozyme polymerases, dephosphorylases, 
restriction endoribonucleases and methods; cleaves single-stranded RNA at 
specific site by transesterification" US 4,987,071, all incorporated herein by 
reference. 

It will be appreciated that it may be desirable that the antisense molecule or 
ribozyme is expressed from a cell-specific promoter element. 

The genetic constructs of the invention can be prepared using methods well 
known in the art. 

A further aspect of the invention is a method of screening for a molecule that 
binds to ECSM4 or a suitable variant, fragment or fusion thereof, or a fusion of 
a said fragment or fusion thereof, the method comprising 1) contacting a) the 
ECSM4 polypeptide with b) a test molecule, 2) detecting the presence of a 
complex containing the ECSM4 polypeptide and a test molecule, and 
optionally 3) identifying any test molecule bound to the ECSM4 polypeptide. 

Preferably the ECSM4 polypeptide is one as described above in respect of the 
eleventh aspect of the invention. 

In a preferred embodiment, the test molecule is a polypeptide. 

In a further preferred embodiment, the method is used to identify natural 
ligands of ECSM4. Thus, in this embodiment the test molecule includes the 
natural ligand of ECSM4. A particularly useful technique for the identification 
of natural ligands of polypeptide molecules is the yeast two-hybrid technique. 
This technique is well known in the art and relies on binding between a 



WO 02/36771 




PCT/GB01/04906 



molecule and its cognate ligand to bring together two parts of a transcription 
complex (which are fused one to the molecule in question and other to the test 
ligand) which, when together, promote transcription of a reporter gene. 

Hence, a preferred embodiment of this aspect of the invention comprises use of 
the screening method, preferably the yeast two-hybrid system, to identify 
natural ligands of the ECSM4 polypeptide. 

A molecule which is identifiable as binding the ECSM4 polypeptide is a 
further aspect of the invention. 

It will be appreciated that a molecule which binds to ECSM4 may modulate the 
activation of ECSM4. 

Suitable peptide ligands that will bind to ECSM4 may be identified using 
methods known in the art. 

One method, disclosed by Scott and Smith (1990) Science 249, 386-390 and 
Cwirla et al (1990) Proc. Natl. Acad. Sci. USA 87, 6378-6382, involves the 
screening of a vast library of filamentous bacteriophages, such as M13 or fd, each 
member of the library having a different peptide fused to a protein on the surface 
of the bacteriophage. Those members of the library that bind to ECSM4 are 
selected using an iterative binding protocol, and once the phages that bind most 
tightly have been purified, the sequence of the peptide ligands may be determined 
simply by sequencing the DNA encoding the surface protein fusion. Another 
method that can be used is the NovaTope (TM) system commercially available 
from Novagen, Inc., 597 Science Drive, Madison, WI 53711. The method is 
based on the creation of a library of bacterial clones, each of which stably 
expresses a small peptide derived from a candidate protein in which the ligand is 
believed to reside. The library is screened by standard lift methods using the 
antibody or other binding agent as a probe. Positive clones can be analysed 
directly by DNA sequencing to determine the precise amino acid sequence of the 
ligand. 

Further methods using libraries of beads conjugated to individual species of 
peptides as disclosed by Lam et al (1991) Nature 354, 82-84 or synthetic peptide 
combinatorial libraries as disclosed by Houghten et al (1991) Nature 354, 84-86 
or matrices of individual synthetic peptide sequences on a solid support as 
disclosed by Pirrung et al in US 5143854 may also be used to identify peptide 
ligands. 

It will be appreciated that screening assays which are capable of high 
throughput operation will be particularly preferred. Examples may include 
cell-based assays and protein-protein binding assays. An SPA-based (Scintillation 
Proximity Assay; Amersham International) system may be used. For example, 
an assay for identifying a compound capable of modulating the activity of a 
protein kinase may be performed as follows. Beads comprising scintillant and a 
polypeptide that may be phosphorylated may be prepared. The beads may be 
mixed with a sample comprising the protein kinase and 32P-ATP or 33P-ATP 
and with the test compound. Conveniently this is done in a 96-well format. 
The plate is then counted using a suitable scintillation counter, using known 
parameters for 32P or 33P SPA assays. Only 32P or 33P that is in proximity to the 
scintillant, i.e. only that bound to the polypeptide, is detected. Variants of such 
an assay, for example in which the polypeptide is immobilised on the 
scintillant beads via binding to an antibody, may also be used. 
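
By way of illustration only, counts from such a plate might be converted into a percentage inhibition for each test compound as sketched below. The normalisation against a no-compound control well and a no-kinase background well is a common convention assumed here for the example; it is not prescribed by the assay description above, and the function name and count values are hypothetical.

```python
# Illustrative sketch: the control/background normalisation is an assumption,
# and the counts used in the example are invented.

def percent_inhibition(test_cpm: float,
                       control_cpm: float,
                       background_cpm: float = 0.0) -> float:
    """Percentage reduction in bead-bound signal relative to a no-compound control.

    test_cpm:       counts for the well containing the test compound
    control_cpm:    counts for a well with kinase but no test compound
    background_cpm: counts for a well without kinase (non-specific signal)
    """
    signal_window = control_cpm - background_cpm
    if signal_window <= 0:
        raise ValueError("control counts must exceed background counts")
    return 100.0 * (1.0 - (test_cpm - background_cpm) / signal_window)


# Hypothetical wells from a 96-well plate.
print(round(percent_inhibition(3000, 10000, 1000), 1))
```

A negative result under this convention would indicate that the test compound increased, rather than decreased, the detected phosphorylation.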

Other methods of detecting polypeptide/polypeptide interactions include 
ultrafiltration with ion spray mass spectroscopy/HPLC methods or other 
physical and analytical methods. Fluorescence Resonance Energy Transfer 
(FRET) methods, for example, well known to those skilled in the art, may be 
used, in which binding of two fluorescent labelled entities may be measured by 
measuring the interaction of the fluorescent labels when in close proximity to 
each other. 

Alternative methods of detecting binding of a polypeptide to macromolecules, 
for example DNA, RNA, proteins and phospholipids, include a surface 
plasmon resonance assay, for example as described in Plant et al (1995) Analyt 
Biochem 226(2), 342-348. Methods may make use of a polypeptide that is 
labelled, for example with a radioactive or fluorescent label. 

A further method of identifying a compound that is capable of binding to the 
ECSM4 polypeptide is one where the polypeptide is exposed to the compound 
and any binding of the compound to the said polypeptide is detected and/or 
measured. The binding constant for the binding of the compound to the 
polypeptide may be determined. Suitable methods for detecting and/or 
measuring (quantifying) the binding of a compound to a polypeptide are well 
known to those skilled in the art and may be performed, for example, using a 
method capable of high throughput operation, for example a chip-based 
method. New technology, called VLSIPS™, has enabled the production of 
extremely small chips that contain hundreds of thousands or more of different 
molecular probes. These biological chips or arrays have probes arranged in 
arrays, each probe assigned a specific location. Biological chips have been 
produced in which each location has a scale of, for example, ten microns. The 
chips can be used to determine whether target molecules interact with any of 
the probes on the chip. After exposing the array to target molecules under 
selected test conditions, scanning devices can examine each location in the 
array and determine whether a target molecule has interacted with the probe at 
that location. 

Biological chips or arrays are useful in a variety of screening techniques for 
obtaining information about either the probes or the target molecules. For 
example, a library of peptides can be used as probes to screen for drugs. The 
peptides can be exposed to a receptor, and those probes that bind to the 
receptor can be identified. See US Patent No. 5,874,219 issued 23 February 
1999 to Rava et al. 

Another method of targeting proteins that modulate the activity of ECSM4 is 
the yeast two-hybrid system, where the polypeptides of the invention can be 
used to "capture" ECSM4 protein binding proteins. The yeast two-hybrid 
system is described in Fields & Song, Nature 340:245-246 (1989). 

It will be understood that it will be desirable to identify compounds that may 
modulate the activity of the polypeptide in vivo. Thus it will be understood 
that reagents and conditions used in the method may be chosen such that the 
interactions between the said polypeptide and the interacting polypeptide are substantially 
the same as between a said naturally occurring polypeptide and a naturally 
occurring interacting polypeptide in vivo. 

It will be appreciated that in the method described herein, the ligand may be a 
drug-like compound or lead compound for the development of a drug-like 
compound. 

The term "drug-like compound" is well known to those skilled in the art, and 
may include the meaning of a compound that has characteristics that may make 
it suitable for use in medicine, for example as the active ingredient in a 
medicament. Thus, for example, a drug-like compound may be a molecule that 
may be synthesised by the techniques of organic chemistry, less preferably by 
techniques of molecular biology or biochemistry, and is preferably a small 
molecule, which may be of less than 5000 daltons and which may be 
water-soluble. A drug-like compound may additionally exhibit features of selective 
interaction with a particular protein or proteins and be bioavailable and/or able 
to penetrate target cellular membranes, but it will be appreciated that these 
features are not essential. 

The term "lead compound" is similarly well known to those skilled in the art, 
and may include the meaning that the compound, whilst not itself suitable for 
use as a drug (for example because it is only weakly potent against its intended 
target, non-selective in its action, unstable, poorly soluble, difficult to 
synthesise or has poor bioavailability) may provide a starting-point for the 
design of other compounds that may have more desirable characteristics. 

Alternatively, the methods may be used as "library screening" methods, a term 
well known to those skilled in the art. Thus, for example, the method of the 
invention may be used to detect (and optionally identify) a polynucleotide 
capable of expressing a polypeptide activator of ECSM4. Aliquots of an 
expression library in a suitable vector may be tested for the ability to give the 
required result. 

Hence, an embodiment of this aspect of the invention provides a method of 
identifying a drug-like compound or lead compound for the development of a 
drug-like compound that modulates the activity of the polypeptide ECSM4, the 
method comprising contacting a compound with the polypeptide or a suitable 
variant, fragment, derivative or fusion thereof or a fusion of a variant, fragment 
or derivative thereof and determining whether, for example, the enzymic 
activity of the said polypeptide is changed compared to the activity of the said 
polypeptide or said variant, fragment, derivative or fusion thereof or a fusion of 
a variant, fragment or derivative thereof in the absence of said compound. 

Preferably, the ECSM4 polypeptide is as described above in respect of the 
eleventh aspect of the invention. 

It will be understood that it will be desirable to identify compounds that may 
modulate the activity of the polypeptide in vivo. Thus it will be understood 
that reagents and conditions used in the method may be chosen such that the 
interactions between the said polypeptide and its substrate are substantially the 
same as in vivo. 

In one embodiment, the compound decreases the activity of said polypeptide. 
25 For example, the compound may bind substantially reversibly or substantially 
irreversibly to the active site of said polypeptide. In a further example, the 
compound may bind to a portion of said polypeptide that is not the active site 
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so as to interfere with the binding of the said polypeptide to its ligand. In a 
still further example, the compound may bind to a portion of said polypeptide 
so as to decrease said polypeptide's activity by an allosteric effect. This 
allosteric effect may be an allosteric effect that is involved in the natural 
regulation of the said polypeptide's activity, for example in the activation of 
the said polypeptide by an "upstream activator". 

A still further aspect of the invention provides a polynucleotide comprising a 
promoter and/or regulatory portion of any one of the ECSM1 or ECSM4 genes. 

By "ECSM1 or ECSM4 genes" we mean the natural genomic sequence which 
when transcribed is capable of encoding a polypeptide comprising the ECSM1 
or ECSM4 polypeptide sequence as defined herein. The natural genomic 
sequence of the ECSM1 or ECSM4 genes may contain introns. 

The polynucleotide of this aspect of the invention is preferably one which has 
transcriptional promoter activity. A promoter is an expression control element 
formed by a DNA sequence that permits binding of RNA polymerase and 
transcription to occur. Preferably the transcriptional promoter activity is 
present in mammalian cells and more preferably the polynucleotide has 
transcriptional promoter activity in endothelial cells. In a preferred 
embodiment, the transcriptional promoter activity is present in endothelial cells 
and not in other cell types. 

Preferably, the promoter and/or regulatory portion is one which can direct 
endothelial cell selective expression. 

Preferably, the promoter or regulatory region of the ECSM4 gene is one which 
is capable of promoting transcription of an operatively-linked coding sequence 
in response to hypoxic conditions. More preferably, the level of transcription 
of the coding sequence is up-regulated in hypoxic conditions compared to the 
level of transcription in the absence of hypoxia. By "hypoxic conditions" we 
include the physiological conditions of cancer where the inappropriate cell 
proliferation deprives surrounding tissue of oxygen, cardiac disease where for 
example a vessel occlusion may restrict the delivery of oxygen to certain 
tissues, and tissue necrosis where destruction of vascular tissue cells results in 
a reduced supply of oxygen to surrounding tissue and the consequent death of 
that surrounding tissue. Hypoxia is described in more detail in Hockel and 
Vaupel (2001) J. Natl Cancer Inst. 93: 266-276. 

Hence, in a preferred embodiment, the ECSM4 promoter or regulatory region 
is comprised in a vector suitable for use in gene therapy for driving expression 
of a therapeutic gene to treat a hypoxic condition. Preferably, the hypoxic 
condition is cancer or cardiac disease. A "therapeutic gene" may be any gene 
which provides a desired therapeutic effect. 

It will be appreciated that use of the said ECSM4 promoter to treat a hypoxic 
condition, for example by gene therapy, is included within the scope of the 
present invention. 

Methods for the determination of the sequence of the promoter region of a gene 
are well known in the art. The presence of a promoter region may be 
determined by identification of known motifs, and confirmed by mutational 
analysis of the identified sequence. Preferably, the promoter sequence is 
located in the region 5 kb upstream of the genomic coding region of ECSM1 or 
ECSM4. More preferably, it is located in the region 3 kb or 2 kb or 1 kb or 
500 bp upstream, and still more preferably it is located within 210 bp of the 
transcription start site. 

Regulatory regions, or transcriptional elements such as enhancers, are less 
predictable than promoters in their location relative to a gene. However, many 
motifs indicative of regulatory regions are well characterised and such regions 
affecting the level of transcription of the relevant gene can usually be identified 
on the basis of these motifs. The function of such a region can be 
demonstrated by well-known methods such as mutational analysis and in vitro 
DNA-binding assays including DNA footprinting and gel mobility shift assays. 

Regulatory regions influencing the transcription of the ECSM1 or ECSM4 
genes are likely to be located within the region 20 kb or 10 kb or 7 kb or 5 kb or 3 
kb, or more preferably 1 kb 5' upstream of the relevant genomic coding region 
or can be located within introns of the gene. 

Sequence tagged sites and mapping intervals will be helpful in localising 
promoter regions, regulatory regions and physical clones. 

In a further preferred embodiment, the polynucleotide comprising the promoter 
and/or regulatory portion is operatively linked to a polynucleotide encoding a 
polypeptide. Methods for linking promoter polynucleotides to polypeptide 
coding sequences are well known in the art. 

Preferably the polypeptide is a therapeutic polypeptide. A therapeutic 
polypeptide may be any polypeptide which it is medically useful to express 
selectively in endothelial cells. Examples of such therapeutic polypeptides 
include antiproliferative, immunomodulatory or blood clotting-influencing 
factors, or antiproliferative or anti-inflammatory cytokines. They may also 
comprise anti-cancer polypeptides. 

In one embodiment of this aspect of the invention, the polynucleotide is one 
suitable for use in medicine. Thus, the invention includes the polynucleotide 
packaged and presented for use in medicine. It will be appreciated that such 
polynucleotides will be especially useful in gene therapy, especially where it is 
desirable to express a therapeutic polypeptide selectively in an endothelial cell. It 
is preferred if the polynucleotide is one suitable for use in gene therapy. 

Gene therapy may be carried out according to generally accepted methods, for 
example, as described by Friedman, 1991. A virus or plasmid vector (see 
further details below), containing a copy of the gene to be expressed linked to 
expression control elements such as promoters and other regulatory elements 
influencing transcription of ECSM1 or ECSM4 as described above and capable 
of replicating inside endothelial cells, is prepared. Suitable vectors are known, 
such as disclosed in US Patent 5,252,479 and WO 93/07282. The vector is 
then injected into the patient, either locally or systemically. If the transfected 
gene is not permanently incorporated into the genome of each of the targeted 
endothelial cells, the treatment may have to be repeated periodically. 

Gene transfer systems known in the art may be useful in the practice of the 
gene therapy methods of the present invention. These include viral and 
nonviral transfer methods. A number of viruses have been used as gene 
transfer vectors, including papovaviruses, eg SV40 (Madzak et al, 1992), 
adenovirus (Berkner, 1992; Berkner et al, 1988; Gorziglia and Kapikian, 1992; 
Quantin et al, 1992; Rosenfeld et al, 1992; Wilkinson et al, 1992; 
Stratford-Perricaudet et al, 1990), vaccinia virus (Moss, 1992), adeno-associated 
virus (Muzyczka, 1992; Ohi et al, 1990), herpesviruses including HSV and EBV 
(Margolskee, 1992; Johnson et al, 1992; Fink et al, 1992; Breakfield and 
Geller, 1987; Freese et al, 1990), and retroviruses of avian (Bandyopadhyay 
and Temin, 1984; Petropoulos et al, 1992), murine (Miller, 1992; Miller et al, 
1985; Sorge et al, 1984; Mann and Baltimore, 1985; Miller et al, 1988), and 
human origin (Shimada et al, 1991; Helseth et al, 1990; Page et al, 1990; 
Buchschacher and Panganiban, 1992). To date most human gene therapy 
protocols have been based on disabled murine retroviruses. 

Nonviral gene transfer methods known in the art include chemical techniques 
such as calcium phosphate coprecipitation (Graham and van der Eb, 1973; 
Pellicer et al, 1980); mechanical techniques, for example microinjection 
(Anderson et al, 1980; Gordon et al, 1980; Brinster et al, 1981; Constantini 
and Lacy, 1981); membrane fusion-mediated transfer via liposomes (Felgner et 
al, 1987; Wang and Huang, 1989; Kaneda et al, 1989; Stewart et al, 1992; 
Nabel et al, 1990; Lim et al, 1992); and direct DNA uptake and 
receptor-mediated DNA transfer (Wolff et al, 1990; Wu et al, 1991; Zenke et al, 1990; 
Wu et al, 1989b; Wolff et al, 1991; Wagner et al, 1990; Wagner et al, 1991; 
Cotten et al, 1990; Curiel et al, 1991a; Curiel et al, 1991b). 

Other suitable systems include the retroviral-adenoviral hybrid system 
described by Feng et al (1997) Nature Biotechnology 15, 866-870, or viral 
systems with targeting ligands such as suitable single chain Fv fragments. 

In an approach which combines biological and physical gene transfer methods, 
plasmid DNA of any size is combined with a polylysine-conjugated antibody 
specific to the adenovirus hexon protein, and the resulting complex is bound to 
an adenovirus vector. The trimolecular complex is then used to infect cells. 
The adenovirus vector permits efficient binding, internalization, and 
degradation of the endosome before the coupled DNA is damaged. 

Liposome/DNA complexes have been shown to be capable of mediating direct 
in vivo gene transfer. While in standard liposome preparations the gene 
transfer process is nonspecific, localized in vivo uptake and expression have 
been reported in tumour deposits, for example, following direct in situ 
administration (Nabel, 1992). 

Gene transfer techniques which target DNA directly to tissues, eg endothelial 
cells, are preferred. Receptor-mediated gene transfer, for example, is 
accomplished by the conjugation of DNA (usually in the form of covalently 
closed supercoiled plasmid) to a protein ligand via polylysine. Ligands are 
chosen on the basis of the presence of the corresponding ligand receptors on 
the cell surface of the target cell/tissue type. In the case of endothelial cells, a 
suitable receptor is ECSM4. These ligand-DNA conjugates can be injected 
directly into the blood if desired and are directed to the target tissue where 
receptor binding and internalization of the DNA-protein complex occurs. To 
overcome the problem of intracellular destruction of DNA, coinfection with 
adenovirus can be included to disrupt endosome function. 

In the case where replacement gene therapy using a functionally wild-type 
gene is used, it may be useful to monitor the treatment by detecting the 
presence of replacement gene mRNA or encoded replacement polypeptide, or 
functional gene product, at various sites in the body, including the endothelial 
cells, blood serum, and bodily secretions/excretions, for example urine. 

A further aspect of the present invention provides a method of treating an 
individual with cancer, cardiac disease, a hypoxic condition, endometriosis or 
atherosclerosis comprising administering to the individual a polynucleotide 
according to the invention, which polynucleotide comprises a promoter or 
regulatory region of the invention operatively linked to a polynucleotide 
encoding a therapeutic polypeptide. 

A still further aspect of the invention provides a method of modulating 
angiogenesis in an individual comprising administering to the individual a 
polynucleotide according to the invention, which polynucleotide comprises a 
promoter or regulatory region of the invention operatively linked to a 
polynucleotide encoding a therapeutic polypeptide or a polynucleotide which is 
capable of expressing ECSM4 or a fragment or variant thereof or which 
comprises an ECSM4 antisense nucleic acid. 

The therapeutic polypeptide may be any therapeutic polypeptide which is 
useful in treating the individual. Preferably, the therapeutic polypeptide is any 
one or more of immunomodulatory, anti-cancer, a blood-clotting-influencing 
factor or an antiproliferative or anti-inflammatory cytokine. 

Antisense nucleic acid is discussed in more detail above. Briefly, the function 
of an antisense nucleic acid is to inhibit the translation of a specific mRNA to 
which the antisense nucleic acid is complementary and able to hybridise to 
within a cell, at least in part. The design of optimal antisense nucleic acid 
molecules is well known in the art of molecular biology. 

The present invention also provides a use of a polynucleotide according to the 
invention, which polynucleotide comprises a promoter or regulatory region of 
the invention operatively linked to a polynucleotide encoding a therapeutic 
polypeptide in the manufacture of a medicament for treating cancer, cardiac 
disease, a hypoxic condition, endometriosis or atherosclerosis. 

The invention will now be described in more detail by reference to the 
following Examples and Figures herein.

Figure 1. 

Experimental verification by reverse transcription PCR. Candidate endothelial 
specific genes predicted by the combination of the UniGene/EST screen and 
xProfiler SAGE differential analysis (Table 8) were checked for expression in 
three endothelial and nine non-endothelial cell cultures. Endothelial cultures 
were as follows: HMVEC (human microvascular endothelial cells), HUVEC 
(human umbilical vein endothelial cells) confluent culture and HUVEC 
proliferating culture. Non-endothelial cultures were as follows: normal 
endometrial stromal (NES) cells grown in normoxia and NES grown in 
hypoxia, MDA 453 and MDA 468 breast carcinoma cell lines, HeLa, FEK4 
fibroblasts cultured in normoxia and FEK4 fibroblasts cultured in hypoxia, and 
SW480 and HCT116, two colorectal epithelium cell lines. ECSM1 showed 
complete endothelial specificity, while magic roundabout/ECSM4 was very 
strongly preferentially expressed in the endothelium. Interestingly, both these 
novel genes appear more endothelial specific than the benchmark endothelial 
specific gene: von Willebrand factor. 

Figure 2. 

Phrap generated contig sequence for ECSM1 and amino acid sequence of the 
translation product. The ESTs used to generate this contig are shown in Table 
10. 

Figure 3. 

ECSM4 in vitro transcription/translation. The cDNA coding for full length 
ECSM4 was cloned into pBluescript plasmid vector. Circular and HindIII 
digested plasmid were subjected to in vitro transcription/translation using 
TNT® T7 Quick Coupled Transcription/Translation System (Promega 
Corporation) incorporating 35S-methionine as per manufacturer's instructions. 
The reaction products were resolved by SDS PAGE and visualised by 
autoradiography. The Luciferase plasmid was utilised as a positive control for 
the reaction. The numbers on the left indicate the position of molecular size 
markers for reference. The size of the band denoting ECSM4 is consistent with 
the calculated molecular weight of the polypeptide of 118 kDa. 

Figure 4. 

cDNA and computer translation of GenBank AK000805 (human 
ECSM4/magic roundabout). 

Figure 5. 

Phrap generated contig sequence for human ECSM4 (magic roundabout) ESTs 
and translation of the encoded polypeptide. The DNA sequence is shown in 
the orientation as if it were a cDNA, which is opposite to that in which it was 
originally generated. The ESTs used to generate the contig are shown in Table 
11. Translation start in this sequence is at position 2 of the contig sequence, 
and translation finish is at position 948. 

Figure 6. 

An alignment of the GenBank Accession No AK000805 ("magic.seq") and 
Phrap ("Hs.111518") generated nucleic acid sequences of human ECSM4 given 
in Figures 4 and 5. 

Figure 7. 

Mouse ECSM4 contig nucleotide sequence and amino acid sequence. 

Figure 8. 

An alignment of the amino acid sequences of the mouse Robo1 protein 
("T30805") and human ECSM4 ("magic.pep"). 

Figure 9. 

An alignment of the amino acid sequences of mouse Robo1 protein 
("T30805") and mouse ECSM4 ("mousemagic.pep"). 

Figure 10. 

An alignment of the amino acid sequences of human ("magic.pep") and mouse 
("mousemagic.pep") ECSM4 proteins. Residues in bold indicate well 
conserved sequences. The mouse protein sequence is shown on top and the 
human sequence is below. 

Figure 11. 

Expression of magic roundabout in vitro. (a) Ribonuclease protection 
analysis. Top, two probes to different regions (nucleotides 1 to 355 and 
3333 to 3679) of magic roundabout were used in the analysis (shown left and 
right). RNase protection assay was performed with U6 small nuclear RNA 
as control (shown bottom) (Maxwell et al (1999) Nature 399: 271). Human 
cell lines and primary isolates: MRC-5, fibroblast cell line, MCF-7, breast 
carcinoma cell line, Neuro, SY-SH-5Y neuroblastoma cell line, HUVEC, 
umbilical vein endothelial isolate, HDMEC, dermal microvascular 
endothelial isolate and HMME2, mammary microvascular endothelial cell 
line. N, normoxia, H, hypoxia, P, proliferating. (b) Western analysis of 
cell lysates. A band at ~110 kD corresponds to MR and was stronger in 
cells exposed to hypoxia for 18 h. The experiment was repeated twice with 
similar results. Immunoblotting was carried out as described in Brown et al 
(2000) Cancer Res. 60: 6298. Polyclonal rabbit anti-sera were raised against 
the following peptides coupled to keyhole limpet haemocyanin: amino acids 
165-181 (LSQSPGAVPQALVAWRA) and 274-288 
(DSVLTPEEVALCLEL) (anti-sera 1) or peptides 311-320 (TYGYISVPTA) 
and 336-351 (KGGVLLCPPRPCLTPT) (anti-sera 2). Both anti-sera gave 
identical results. For western analysis, anti-sera were affinity purified on a 
"Hi-Trap NHS-activated HP" column (Amersham) to which the peptides 
used to raise anti-sera 1 were coupled. 

Figure 12. 

Human ECSM4 full-length cDNA and encoded protein sequence. 

Figure 13. 

Mouse ECSM4 full-length cDNA (MuMR.seq) and encoded protein sequence. 

Figure 14. 

Alignment of human ECSM4 (top) and mouse ECSM4 (bottom) amino acid 
sequences. 

Figure 15. 

Alignment of human ECSM4 ("HuMR.seq"; top) and mouse ECSM4 
("MuMR.seq"; bottom) cDNA sequences. 

Figure 16. 

In situ hybridisation analysis of human placental tissue using ECSM4 as probe. 
A bright field view at 10x magnification of a thin section of placental tissue. 
The arrow indicates a large blood vessel. 

Figure 17. 

In situ hybridisation analysis of human placental tissue using ECSM4 as probe. 
A higher magnification of the bright-field view of thin section of placental 
tissue shown in Figure 16, focussing on the blood vessel. The arrow points to 
endothelial cells lining the lumen of the vessel. 

Figure 18. 

In situ hybridisation analysis of human placental tissue using ECSM4 as probe. 
A higher magnification of the thin section of placental tissue shown in Figure 
16, focussing on the blood vessel and shown here in dark-field. The arrow 
depicts positive staining of endothelial cells lining the lumen of the vessel. 

Figure 19. 

In situ hybridisation analysis of colorectal liver metastatic tissue using ECSM4 
as probe. A bright-field view of a section of colorectal liver metastatic tissue 
magnified with (A) 10x and (B) 20x objective. The area marked by the 
boundary (encircling *) depicts the normal liver tissue. The arrow in (B) 
shows one of the blood vessels within the metastatic tumour tissue. 

Figure 20. 

In situ hybridisation analysis of colorectal liver metastatic tissue using ECSM4 
as a probe. This is a dark field view of a section of colorectal liver metastatic 
tissue magnified with (A) 10x and (B) 20x objective. The area marked by the 
boundary (encircling *) depicts the normal liver tissue. The arrow in (B) 
shows one of the blood vessels within the metastatic tumour tissue 
corresponding to the vessel shown in Figure 19B. Expression of ECSM4 is 
restricted to endothelial cells of the tumour blood vessels. Note that there is 
little expression in the surrounding normal tissue (*). 

phosphatase. The arrows show high levels of expression of ECSM4 restricted 
to the vascular endothelial cells. Note that the surrounding tissue shows little 
staining. Comparison with Figures 22 and 23 shows that the expression of 
ECSM4 colocalises with that of vWF, a known marker for vascular endothelial 
cells. 

Figure 25. 

Immunohistochemical analysis of HUVEC cells: von Willebrand Factor (vWF). 
HUVEC cells were immobilised and analysed by immunohistochemistry using 
an antibody recognising von Willebrand Factor (a marker for endothelial cells) 
as the primary antibody and visualised using anti-rabbit antibody coupled with 
alkaline phosphatase. The arrows show expression of vWF in a subset of the 
HUVEC cells. 

Figure 26. 

Immunohistochemical analysis of HUVEC cells using the antibody MGO-7. 
HUVEC cells were immobilised and analysed by immunohistochemistry 
using MGO-7 antibody (a rabbit polyclonal antibody raised against peptides 
MR 311 and MR 336) as the primary antibody and visualised using anti-rabbit 
antibody coupled with alkaline phosphatase. The arrows show 
expression of ECSM4 in a subset of the HUVEC cells. Note that the staining 
is localised primarily to the cell surface of the cells. 

Figure 27. Expression of magic roundabout in vivo. 
(A) Expression of MR detected by in situ hybridisation in a placental 
arteriole (a) and venule (b) (left, light field and right, dark field). (c) 
Immunohistochemical staining of magic roundabout in a placental arteriole. 
Left, von Willebrand factor control and right, magic roundabout. (B) 
Expression of MR in tumour endothelium. Ganglioglioma (a) x20 and (b) 
x50. Left, light field; right, dark field. Arrows highlight a vessel running 
diagonally down the section with an erythrocyte within it. Endothelial cells 
are strongly positive for MR expression. Papillary bladder carcinoma (c) x20 
and (d) x50. The vascular core of the papilla of the tumour is strongly 
positive, particularly the 'flat' endothelial cells indicated by arrows. A magic 
roundabout antisense in situ probe was generated using T3 polymerase from 
IMAGE EST clone 1912098 (GenBank acc. AI278949). The plasmid was 
linearised with Eco RI prior to probe synthesis. In situ analysis was then 
performed as described in Poulsom et al (1998) Eur. J. Histochemistry 
42:121-132. 

Example 1. 

In silico cloning of novel endothelial specific genes. 

We describe the use of two independent strategies for differential expression 
analysis combined with experimental verification to identify genes specifically 
or preferentially expressed in vascular endothelium. 

The first strategy was based on EST cluster expression analysis in the human 
UniGene gene index (Schuler et al, 1997). Recurrent gapped BLAST searches 
(Altschul et al, 1997) were performed at very high stringency against expressed 
sequence tags (ESTs) grouped in two pools. These two pools comprised 
endothelial cell and non-endothelial cell libraries derived from dbEST 
(Boguski et al, 1995). The second strategy employed a second datamining 
tool: SAGEmap xProfiler. XProfiler is a freely available on-line tool, which is 
a part of the NCBI's Cancer Genome Anatomy Project (CGAP) (Strausberg et 
al, 1997, Cole et al, 1995). While these two approaches alone produced 
a discouragingly high number of false positives, when both strategies were 
combined, predictions proved exceptionally reliable and two novel candidate 
endothelial-specific genes have been identified. Full-length cDNAs have been 
identified in sequence databases. Another gene (EST cluster) corresponds to a 
partial cDNA sequence from a large-scale cDNA sequencing project and 
contains a region of similarity to the intracellular domain of human roundabout 
homologue 1 (ROBO1). 

UniGene/EST gene index screen 

A pool of endothelial and a pool of non-endothelial sequences were extracted 
using Sequence Retrieval System (SRS) version 5 from dbEST. The 
endothelial pool consisted of 11,117 ESTs from nine human endothelial 
libraries (Table 1). The non-endothelial pool included 173,137 ESTs from 108 
human cell lines and microdissected tumour libraries (Table 2). ESTs were 
extracted from dbEST release April 2000. Multiple FASTA files were 
transformed into a BLAST searchable database using the pressdb programme. 
Table 3 shows the expression status of five known endothelial cell-specific 
genes in these two pools. 

Subsequently, the longest, representative sequence in each UniGene cluster 
(UniGene Build #111 May 2000, multiple FASTA file hs.seq.uniq) was 
searched using very high stringency BLAST against these two pools. If such 
representative sequence reported no hits, the rest of the sequences belonging to 
the cluster (UniGene multiple-FASTA file hs.seq) were used as BLAST 
queries. Finally, clusters with no hits in the non-endothelial pool and at least 
one hit in the endothelial pool were selected. 
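
The selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the PERL scripts used in this work: the function name, the callables that return hit counts, and the reading of "no hits" as "no hits in either pool" are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def select_endothelial_clusters(
    clusters: Dict[str, List[str]],
    hits_endothelial: Callable[[str], int],
    hits_non_endothelial: Callable[[str], int],
) -> List[str]:
    """Return cluster ids with >= 1 endothelial hit and no non-endothelial hit.

    `clusters` maps a cluster id to its sequence ids, with the representative
    (longest) sequence first. The two callables are hypothetical stand-ins for
    a high-stringency BLAST search against each EST pool.
    """
    selected = []
    for cluster_id, seq_ids in clusters.items():
        if not seq_ids:
            continue
        representative, rest = seq_ids[0], seq_ids[1:]
        endo = hits_endothelial(representative)
        non_endo = hits_non_endothelial(representative)
        if endo == 0 and non_endo == 0:
            # Assumption: fall back to the remaining cluster members only
            # when the representative sequence hits neither pool.
            endo = sum(hits_endothelial(s) for s in rest)
            non_endo = sum(hits_non_endothelial(s) for s in rest)
        if non_endo == 0 and endo >= 1:
            selected.append(cluster_id)
    return selected
```

Keeping the BLAST search behind a callable means the same rule could be applied to pre-computed hit tables or to live searches.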

Optimising the BLAST E-value was crucial for the success of BLAST 
identity-level searches. Too high an E-value would result in gene paralogues being 
reported. On the other hand, too low (stringent) an E-parameter would result in 
many false negatives, i.e. true positives would not be reported due to 
sequencing errors in EST data: ESTs are large-scale low-cost single pass 
sequences and have a high error rate (Aaronson et al, 1996). In this work an 
E-value of 10e-20 was used in searches against the non-endothelial EST pool and a 
more stringent 10e-30 value in searches against the smaller endothelial pool. 
These values were deemed optimal after a series of test BLAST searches. 
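
As a rough illustration of how the two cut-offs might be applied, the sketch below counts hits at or below each threshold. It assumes the hits are available as 12-column tab-separated text with the E-value in the eleventh column, as in the default tabular output of present-day BLAST+; that format, and the function and constant names, are assumptions for the example and not the output format analysed in this work.

```python
from typing import List

# Thresholds as written in the text, read literally as floating-point
# literals (10e-20 == 1e-19). The authors may have intended 1e-20 and 1e-30.
NON_ENDOTHELIAL_EVALUE = 10e-20
ENDOTHELIAL_EVALUE = 10e-30


def parse_evalues(tabular_text: str) -> List[float]:
    """Extract E-values from 12-column tab-separated BLAST output."""
    evalues = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        evalues.append(float(fields[10]))  # eleventh column holds the E-value
    return evalues


def count_hits(evalues: List[float], cutoff: float) -> int:
    """Count hits whose E-value is at or below the cutoff."""
    return sum(1 for e in evalues if e <= cutoff)
```

A hit with an E-value of 1e-25, for instance, would count against the non-endothelial pool but would not count as a match in the endothelial pool under the stricter cut-off.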

SAGE data and SAGEmap xProfiler differential analysis 

Web-based SAGE library subtraction (SAGEmap xProfiler: 
http://www.ncbi.nlm.nih.gov/SAGE/sagexpsetup.cgi) was utilised as the 
second datamining strategy for the identification of novel endothelial specific 
or preferentially endothelial genes. Two endothelial SAGE libraries 
(SAGE_Duke_HMVEC and SAGE_Duke_HMVEC+VEGF with a total of 
110,790 sequences) were compared to twenty-four non-endothelial, cell line 
libraries (full list in Table 4, total of 733,461 sequences). Table 5 shows the 
status of expression of five known endothelial specific genes: von 
Willebrand's factor (vWF), two vascular endothelial growth factor receptors: 
fms-like tyrosine kinase 1 (flt1) and kinase insert domain receptor (KDR), 
tyrosine kinase receptor type tie (TIE1) and tyrosine kinase receptor type tek 
(TIE2/TEK), in these two SAGE pools. 

Combined data gives highly accurate predictions 

Twenty known genes were selected in the UniGene/EST screen (Table 6). 
These genes had no hits in the non-endothelial pool and at least one hit in the 
endothelial pool. The list contained at least four endothelial specific genes: 
TIE1, TIE2/TEK, LYVE1 and multimerin, indicating ~20% accuracy of 
prediction. Other genes on the list, while certainly preferentially expressed in 
the endothelial cells, might not be endothelial specific. To improve on the 
prediction accuracy we decided to combine UniGene/EST screen with the 
xProfiler SAGE analysis. The xProfiler output consisted of a list of genes with 
a ten times higher number of tags in the endothelial than in the non-endothelial 
pool sorted according to the certainty of prediction. A 90% certainty threshold 
was applied to this list. Table 7 shows how data from the two approaches were 
combined. Identity-level BLAST searches were performed on mRNAs (known 
genes) or phrap computed contigs (EST clusters representing novel genes) to 
investigate how these genes were represented in the endothelial and 
non-endothelial pool. Subsequent experimental verification by RT-PCR (Figure 1) 
proved that the combined approach was 100% accurate, i.e. genes on the 
xProfiler list which had no matches in the non-endothelial EST pool and at least 
one match in the endothelial pool were indeed endothelial specific. 
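
The combination of the two screens amounts to a filter followed by an intersection, which can be sketched as follows. The record layout, the field names and the gene labels in the usage note are hypothetical, and the tag counts are assumed to be already normalised between the two SAGE pools; this is an illustration of the logic, not the procedure actually run.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SageEntry:
    gene: str
    endo_tags: float      # tag count in the endothelial SAGE pool
    non_endo_tags: float  # tag count in the non-endothelial SAGE pool
    certainty: float      # certainty of the prediction, from 0 to 1


def sage_candidates(
    entries: List[SageEntry], min_ratio: float = 10.0, min_certainty: float = 0.90
) -> List[str]:
    """Genes with at least a ten-fold tag excess and at least 90% certainty."""
    out = []
    for e in entries:
        if e.certainty < min_certainty or e.endo_tags <= 0:
            continue
        if e.endo_tags >= min_ratio * e.non_endo_tags:
            out.append(e.gene)
    return out


def combine(sage_genes: List[str], est_hits: Dict[str, Tuple[int, int]]) -> List[str]:
    """Keep SAGE candidates with >= 1 endothelial and 0 non-endothelial EST hits.

    `est_hits` maps a gene to (endothelial hits, non-endothelial hits).
    """
    return [
        g
        for g in sage_genes
        if g in est_hits and est_hits[g][0] >= 1 and est_hits[g][1] == 0
    ]
```

For example, a gene labelled "G1" with 50 endothelial and 2 non-endothelial tags at 99% certainty passes the SAGE filter, and is retained by `combine` only if its EST record shows endothelial matches and no non-endothelial match.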

DISCUSSION 

There have been several reports of computer analysis of tissue 
transcriptomes. Usually an expression profile is constructed, based on the 
number of tags assigned to a given gene or a class of genes (Bernstein et al, 
1996, Welle et al, 1999, Bortoluzzi et al, 2000). An attempt can be made to 
identify tissue-specific transcripts, for example Vasmatzis et al, (1997) 
described three novel genes expressed exclusively in the prostate by in silico 
subtraction of libraries from the dbEST collection. Purpose made cDNA 
libraries may also be employed. Ten candidate granulocyte-specific genes 
have been identified by extensive sequence analysis of cDNA libraries derived 
from granulocytes and eleven other tissue samples, namely a hepatocyte cell 
line, foetal liver, infant liver, adult liver, subcutaneous fat, visceral fat, lung, 
colonic mucosa, keratinocytes, cornea and retina (Itoh et al, 1998). 

An analysis similar to the dbEST-based approach taken by Vasmatzis et al, is 
complicated by the fact that endothelial cells are present in all tissues of the 
body and endothelial ESTs contaminate all bulk tissue libraries. To 
validate this we used three well-known endothelial specific genes: KDR, 
FLT1, and TIE-2 as queries for BLAST searches against dbEST. Transcripts 
were present in a wide range of tissues with multiple hits in well vascularised 
tissues (e.g. placenta, retina), embryonic (liver, spleen) or infant (brain) tissues. 
Additionally, we found that simple subtraction of endothelial EST libraries 
against all other dbEST libraries failed to identify any specific genes (data not 
shown). 

Two very different types of expression data resources were used in our 
datamining efforts. The UniGene/EST screen was based on expressed 
sequence tag libraries from dbEST. There are 9 human endothelial libraries in 
the current release of dbEST with a relatively small total number of ESTs: 
~11,117. Some well-known endothelial specific genes are not represented in 
this dataset (Table 3). This limitation raised our concerns that genes with low 
levels of expression would be overlooked in our analysis. Therefore, we 
utilised another type of computable expression data: CGAP SAGE libraries. 
SAGE tags are sometimes called small ESTs (usually 10-11 bp in length). 
Their major advantage is that they can be unambiguously located within the 
cDNA: they are immediately adjacent to the most 3' NlaIII restriction site. 
Although there are only two endothelial CGAP SAGE libraries available at the 
moment, they contain an impressive total of ~111,000 tags, an approximately 
10 times bigger dataset than the ~11,117 sequences in the endothelial EST 
pool. The combined approach proved very accurate (Table 8, Figure 1) when 
verified by RT-PCR. 
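
The positional definition of a SAGE tag given above lends itself to a short sketch. It assumes the cDNA is supplied as the sense strand, 5' to 3', that NlaIII recognises CATG, and that a 10 bp tag is wanted; the function name and defaults are chosen for illustration only.

```python
from typing import Optional


def sage_tag(cdna: str, tag_length: int = 10, anchor: str = "CATG") -> Optional[str]:
    """Return the tag immediately 3' of the 3'-most anchoring-enzyme site.

    Returns None when the sequence has no site or too little sequence
    downstream of the site to yield a full-length tag.
    """
    seq = cdna.upper()
    site = seq.rfind(anchor)  # position of the 3'-most CATG
    if site == -1:
        return None
    start = site + len(anchor)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None
```

Because only the 3'-most site is used, each transcript yields a single tag position, which is what allows tags to be located unambiguously within the cDNA.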

We report here identification of two novel highly endothelial specific genes: 
endothelial cell-specific molecule 1 (ECSM1 - UniGene entry Hs.13957) and 
magic roundabout (UniGene entry Hs.111518). For a comprehensive summary 
of data available on these genes see Table 8. 

Our combined datamining approach together with experimental verification is 
a powerful functional genomics tool. This type of analysis can be applied to 
many cell types not just endothelial cells. The challenge of identifying the 
function of discovered genes remains, but bioinformatics tools such as 
structural genomics, or homology and motif searches can offer insights that can 

20 then be verified experimentally. 

In summary, this screening approach has allowed the identification of novel
endothelial cell specific genes and known genes whose expression was not
known to be specific to endothelial cells. This identification both advances our
understanding of endothelial cell biology and provides new pharmaceutical
targets for imaging, diagnosing and treating medical conditions involving the
endothelium.

METHODS
PERL scripts 

A number of PERL scripts were generated to facilitate large scale sequence 
retrieval, BLAST search submissions, and automatic BLAST output analysis. 

Database sequence retrieval 

Locally stored UniGene files (Build #111, release date May 2000) were used in
the preparation of this report. The UniGene website can be accessed at the
URL www.ncbi.nlm.nih.gov/UniGene/, and UniGene files can be downloaded
from the ftp repository ftp://ncbi.nlm.nih.gov/repository/unigene/.
Representative sequences for the human subset of UniGene (the longest EST
within each cluster) are stored in the file Hs.seq.uniq, while all ESTs belonging
to each cluster are stored in a separate file called Hs.seq.

Sequences were extracted from the dbEST database, accessed locally at the
HGMP centre, using the Sequence Retrieval System (SRS version 5) getz
command. This was done repeatedly using a PERL script for all the libraries in
the endothelial and non-endothelial subsets, and the sequences were merged into
two multiple-FASTA files.

Selection criteria for non-endothelial EST libraries 



Selection of the 108 non-endothelial dbEST libraries was largely manual.
Initially the list of all available dbEST libraries
(http://www.ncbi.nlm.nih.gov/dbEST/libs_byorg.html) was searched using the
keyword 'cells' and the phrase 'cell line'. While this search identified most
of the libraries, additional keywords had to be added for the list to be
complete: 'melanocyte', 'macrophage', 'HeLa', 'fibroblast'. In some cases, the
detailed library description was consulted to confirm that the library was
derived from a cell line or primary culture. We also added a number of CGAP
microdissected tumour libraries. For that, the Library Browser (available at
http://www.ncbi.nlm.nih.gov/CGAP/hTGI/lbrow/cgaplb.cgi) was used to
search for the keyword 'microdissected'.


UniGene gene index screen 

The UniGene gene transcript index was screened against the EST division of
GenBank, dbEST. Both UniGene and dbEST were developed at the National
Center for Biotechnology Information (NCBI). UniGene is a collection of EST
clusters corresponding to putative unique genes. It currently consists of four
datasets: human, mouse, rat and zebrafish. The human dataset comprises
approximately 90,000 clusters (UniGene Build #111, May 2000). By means of
very high stringency BLAST identity searches, we aimed to identify those
UniGene genes that have transcripts in the endothelial and not in the
non-endothelial cell-type dbEST libraries. Throughout the project, University of
Washington blast2, which is a gapped version, was used as the BLAST
implementation. The E-value was set to 10e-20 in searches against the
non-endothelial EST pool and to 10e-30 in searches against the smaller
endothelial pool.
While UniGene does not provide consensus sequences for its clusters, the
longest sequence within each cluster is identified. Thus, this longest
representative sequence (multiple-FASTA file Hs.seq.uniq) was searched using
very high stringency BLAST against the endothelial and non-endothelial EST
pools. If the representative sequence reported no matches, the rest of the
sequences belonging to the cluster (UniGene multiple-FASTA file Hs.seq)
followed as BLAST queries. Finally, clusters with no matches in the
non-endothelial pool and at least one match in the endothelial pool were
selected using PERL scripts analysing BLAST textual output.
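
The final selection step can be sketched as follows. This is an illustration
only and not the PERL code used here: it assumes the BLAST output has already
been reduced to two lists of UniGene cluster identifiers, one entry per hit
that passed the E-value threshold in the respective pool. The function name
select_endothelial_clusters and the example identifiers are hypothetical.

```python
from collections import Counter


def select_endothelial_clusters(endothelial_hits, non_endothelial_hits,
                                min_endothelial=1):
    """Select clusters with no non-endothelial hits and at least
    min_endothelial hits in the endothelial pool.

    Each argument is an iterable of cluster IDs, one entry per BLAST hit.
    Returns the selected IDs sorted by descending endothelial hit count.
    """
    endo = Counter(endothelial_hits)
    non_endo = Counter(non_endothelial_hits)
    selected = [
        cluster for cluster, count in endo.items()
        if count >= min_endothelial and non_endo[cluster] == 0
    ]
    # Most frequently hit clusters first; ties broken by cluster ID.
    return sorted(selected, key=lambda c: (-endo[c], c))


# Hypothetical hit lists: cluster_C also matches the non-endothelial pool.
endo_hits = ["cluster_A"] * 5 + ["cluster_B"] * 2 + ["cluster_C"] * 3
non_endo_hits = ["cluster_C", "cluster_D"]
print(select_endothelial_clusters(endo_hits, non_endo_hits))
```

Raising min_endothelial makes the screen more conservative at the cost of
missing genes with few ESTs in the small endothelial pool.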


xProfiler SAGE subtraction 

xProfiler enables an on-line user to perform a differential comparison of any
combination of forty-seven serial analysis of gene expression (SAGE) libraries,
with a total of ~2,300,000 SAGE tags, using a dedicated statistical algorithm
(Chen et al, 1998). xProfiler can be accessed at
http://www.ncbi.nlm.nih.gov/SAGE/sagexpsetup.cgi. SAGE itself is a
quantitative expression technology in which genes are identified by a sequence
tag, typically 10 or 11 bp long, adjacent to the cDNA's most 3' NlaIII
restriction site (Velculescu et al, 1995).

The two available endothelial cell libraries (SAGE_Duke_HMVEC and
SAGE_Duke_HMVEC+VEGF) defined pool A, and twenty-four non-endothelial
libraries (see Table 4 for list) together built pool B. The approach was
verified by establishing the expression status of the five reference
endothelial specific genes in the two SAGE pools (Table 5) using Gene to Tag
Mapping (http://www.ncbi.nlm.nih.gov/SAGE/SAGEcid.cgi). Subsequently,
xProfiler was used to select genes differentially expressed between pools A
and B. The xProfiler output consisted of a list of genes with a ten-fold
difference in the number of tags in the endothelial compared to the
non-endothelial pool, sorted according to the certainty of prediction. A 90%
certainty threshold was applied to this list.
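
The two cut-offs applied to the xProfiler output can be sketched as a simple
filter. The sketch does not reproduce xProfiler's statistical algorithm; it
assumes that each gene already carries a tag count for each pool (normalised
for pool size) and a certainty value between 0 and 1. The record type
PoolComparison, the function names and the example values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class PoolComparison:
    gene: str
    endothelial_tags: float      # pool A, normalised tag count
    non_endothelial_tags: float  # pool B, normalised tag count
    certainty: float             # 0.0 to 1.0, as reported by the tool


def fold_difference(rec):
    """Ratio of endothelial to non-endothelial tags (inf if pool B is 0)."""
    if rec.non_endothelial_tags == 0:
        return math.inf if rec.endothelial_tags > 0 else 0.0
    return rec.endothelial_tags / rec.non_endothelial_tags


def filter_candidates(records, min_fold=10.0, min_certainty=0.90):
    """Keep genes meeting both cut-offs, sorted by descending certainty."""
    kept = [r for r in records
            if fold_difference(r) >= min_fold and r.certainty >= min_certainty]
    return sorted(kept, key=lambda r: r.certainty, reverse=True)


# Hypothetical records, for illustration only.
records = [
    PoolComparison("gene_1", 40, 2, 0.97),   # 20-fold, kept
    PoolComparison("gene_2", 30, 0, 1.00),   # absent from pool B, kept
    PoolComparison("gene_3", 50, 10, 0.99),  # only 5-fold, dropped
    PoolComparison("gene_4", 40, 2, 0.80),   # certainty too low, dropped
]
print([r.gene for r in filter_candidates(records)])
```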

CGAP's other on-line differential expression analysis tool, Digital
Differential Display (DDD), relies on EST expression data (source library
information) instead of SAGE tags. We attempted to utilise this tool similarly
to SAGEmap xProfiler but were unable to obtain useful results. Five out of
nine endothelial and sixty-four out of one hundred and eight non-endothelial
cell libraries used in our BLAST-oriented approach were available for on-line
analysis using DDD (http://www.ncbi.nlm.nih.gov/CGAP/info/ddd.cgi). When
such analysis was performed, the fifteen top scoring genes were:
annexin A2, actin gamma 1, ribosomal protein large P0, plasminogen activator
inhibitor type I, thymosin beta 4, peptidylprolyl isomerase A, ribosomal
protein L13a, laminin receptor 1 (ribosomal protein SA), eukaryotic translation
elongation factor 1 alpha 1, vimentin, ferritin heavy polypeptide, ribosomal
protein L3, ribosomal protein S18, ribosomal protein L19, tumour protein
translationally-controlled 1. This list was rather surprising: it did not
include any well-known endothelial specific genes, did not overlap with the
SAGE results (Table 8), and contained many genes that are reported in the
literature to be ubiquitously expressed (ribosomal proteins, actin, vimentin,
ferritin). A major advantage of our UniGene/EST screen is that, instead of
relying on source library data and fallible EST clustering algorithms, it
actually performs identity-level BLAST comparisons in search of transcripts
corresponding to a gene.

Experimental verification

To experimentally verify specificity of expression we used the reverse
transcription polymerase chain reaction (RT-PCR). RNA was extracted from
three endothelial and seven non-endothelial cell types cultured in vitro.
Endothelial cultures were as follows: HMVEC (human microvascular
endothelial cells), HUVEC (human umbilical vein endothelial cells) confluent
culture and HUVEC proliferating culture. Non-endothelial cultures were as
follows: normal endometrial stromal (NES) cells grown in normoxia and NES
grown in hypoxia, MDA 453 and MDA 468 breast carcinoma cell lines, HeLa,
FEK4 fibroblasts cultured in normoxia and FEK4 fibroblasts cultured in
hypoxia, and SW480 and HCT116, two colorectal epithelium cell lines.

If a sequence tagged site (STS) was available, dbSTS PCR primers were used
and the cycle conditions suggested in the dbSTS entry were followed. Otherwise,
primers were designed using the Primer3 programme. Primers are listed in
Table 9.

Tissue culture media, RNA extraction and cDNA synthesis 

Cell lines were cultured in vitro according to standard tissue culture protocols.
In particular, endothelial media were supplemented with ECGS (endothelial
cell growth supplement; Sigma) and heparin (Sigma) to promote growth.
Total RNA was extracted using the RNeasy Minikit (Qiagen) and cDNA was
synthesised using the Reverse-IT 1st Strand Synthesis Kit (ABgene).

Table 1.

Nine human endothelial libraries from dbEST

Human aortic endothelium, 20 sequences, in vitro culture
Human endothelial cells, 346 sequences, primary isolate
Human endothelial cell (Y.Mitsui), 3 sequences, in vitro culture
Stratagene endothelial cell 937223, 7171 sequences, primary isolate
Aorta endothelial cells, 1245 sequences, primary isolate
Aorta endothelial cells, TNF treated, 1908 sequences, primary isolate
Umbilical vein endothelial cells I, 9 sequences
HDMEC cDNA library, 11 sequences, in vitro culture
Umbilical vein endothelial cells II, 404 sequences

Table 2.

Non-endothelial dbEST libraries.

1. Activated T-cells I
2. Activated T-cells II
3. Activated T-cells III
4. Activated T-cells IV
5. Activated T-cells IX
6. Activated T-cells V
7. Activated T-cells VI
8. Activated T-cells VII
9. Activated T-cells VIII
10. Activated T-cells X
11. Activated T-cells XI
12. Activated T-cells XII
13. Activated T-cells XX
14. CAMA1Ee cell line I
15. CAMA1Ee cell line II
16. CCRF-CEM cells, cyclohexamide treated I
17. cDNA library of activated B cell line 3D5
18. Chromosome 7 HeLa cDNA Library
19. Colon carcinoma (Caco-2) cell line I
20. Colon carcinoma (Caco-2) cell line II
21. Colon carcinoma (HCC) cell line
22. Colon carcinoma (HCC) cell line II
23. HCC cell line (matastasis to liver in mouse)
24. HCC cell line (matastasis to liver in mouse) II
25. HeLa cDNA (T.Noma)
26. HeLa SRIG (Synthetic retinoids induced genes)
27. Homo sapiens monocyte-derived macrophages
28. HSC172 cells I
29. HSC172 cells II
30. Human 23132 gastric carcinoma cell line
31. Human breast cancer cell line Bcap 37
32. Human cell line A431 subclone
33. Human cell line AGZY-83a
34. Human cell line PCI-06A
35. Human cell line PCI-06B
36. Human cell line SK-N-MC
37. Human cell line TF-1 (D.L.Ma)
38. Human exocervical cells (CGLee)
39. Human fibrosarcoma cell line HT1080
40. Human fibrosarcoma cell line HT1080-6TGc5
41. Human gastric cancer SGC-7901 cell line
42. Human GM-CSF-deprived TF-1 cell line (Liu,Hongtao)
43. Human HeLa (Y.Wang)
44. Human HeLa cells (M.Lovett)
45. Human Jurkat cell line mRNA (Thiele,K.)
46. Human K562 erythroleukemic cells
47. Human lung cancer cell line A549.A549
48. Human nasopharyngeal carcinoma cell line HNE1
49. Human neuroblastoma SK-ER3 cells (M.Garnier)
50. Human newborn melanocytes (T.Vogt)
51. Human pancreatic cancer cell line Patu 8988t
52. Human primary melanocytes mRNA (I.M.Eisenbarth)
53. Human promyelocytic HL60 cell line (S.Herblot)
54. Human retina cell line ARPE-19
55. Human salivary gland cell line HSG
56. Human White blood cells
57. Jurkat T-cells I
58. Jurkat T-cells II
59. Jurkat T-cells III
60. Jurkat T-cells V
61. Jurkat T-cells VI
62. Liver HepG2 cell line.
63. LNCAP cells I
64. Macrophage I
65. Macrophage II
66. Macrophage, subtracted (total cDNA)
67. MCF7 cell line
68. Namalwa B cells I
69. Namalwa B cells II
70. NCI_CGAP_Br4
71. NCI_CGAP_Br5
72. NCI_CGAP_CLL1
73. NCI_CGAP_GCB0
74. NCI_CGAP_GCB1
75. NCI_CGAP_HN1
76. NCI_CGAP_HN3
77. NCI_CGAP_HN4
78. NCI_CGAP_HSC1
79. NCI_CGAP_Li1
80. NCI_CGAP_Li2
81. NCI_CGAP_Ov5
82. NCI_CGAP_Ov6
83. NCI_CGAP_Pr1
84. NCI_CGAP_Pr10
85. NCI_CGAP_Pr11
86. NCI_CGAP_Pr16
87. NCI_CGAP_Pr18
88. NCI_CGAP_Pr2
89. NCI_CGAP_Pr20
90. NCI_CGAP_Pr24
91. NCI_CGAP_Pr25
92. NCI_CGAP_Pr3
93. NCI_CGAP_Pr4
94. NCI_CGAP_Pr4.1
95. NCI_CGAP_Pr5
96. NCI_CGAP_Pr6
97. NCI_CGAP_Pr7
98. NCI_CGAP_Pr8
99. NCI_CGAP_Pr9
100. Normal Human Trabecular Bone Cells
101. Raji cells, cyclohexamide treated I
102. Retinal pigment epithelium 0041 cell line
103. Retinoid treated HeLa cells
104. Soares melanocyte 2NbHM
105. Soares_senescent_fibroblasts_NbHSF
106. Stratagene HeLa cell s3 937216
107. Supt cells
108. T, Human adult Rhabdomyosarcoma cell-line

Table 3.

Five genes known to be endothelial specific in the dbEST pools.

The number of ESTs in the endothelial pool is relatively small (~11,117) and
not all known endothelial genes are represented.

Known endothelial specific gene; hits in the non-endothelial pool; hits in the
endothelial pool

von Willebrand factor (vWF) 1 27
flt1 VEGF receptor
KDR VEGF receptor 1
TIE1 tyrosine kinase — 5
TIE2/TEK tyrosine kinase — 2

Table 4.

Twenty-four non-endothelial cell SAGE-CGAP libraries.

SYMBOL                              DESCRIPTION

SAGE_HCT116                         Colon, cell line derived from colorectal carcinoma
SAGE_Caco_2                         Colon, colorectal carcinoma cell line
SAGE_Duke_H392                      Brain, Duke glioblastoma multiforme cell line
SAGE_SW837                          Colon, cancer cell line
SAGE_RKO                            Colon, cancer cell line
SAGE_NHA(5th)                       Brain, normal human astrocyte cells harvested at
                                    passage 5
SAGE_ES2-1                          Ovarian clear cell carcinoma cell line ES-2,
                                    poorly differentiated
SAGE_OVCA432-2                      Ovary, carcinoma cell line OVCA432
SAGE_OV1063-3                       Ovary, carcinoma cell line OV1063
SAGE_Duke_mhh-1                     Brain, c-myc negative medulloblastoma cell line
                                    mhh-1
SAGE_Duke_H341                      Brain, c-myc positive medulloblastoma cell line
                                    H341
SAGE_HOSE_4                         Ovary, normal surface epithelium
SAGE_OVP-5                          Ovary, pooled cancer cell lines
SAGE_LNCaP                          Prostate, cell line. Androgen dependent
SAGE_HMEC-B41                       Cell culture HMEC-B41 of normal human
                                    mammary epithelial cells
SAGE_MDA453                         Cell line MDA-MB-453 of human breast
                                    carcinoma
[symbol illegible]                  Cell line SK-BR-3. Human breast
                                    adenocarcinoma
SAGE_A2780-9                        Ovary, ovarian cancer cell line A2780
SAGE_Duke_H247_normal               Brain, glioblastoma multiforme cell line, H247
SAGE_Duke_H247_Hypoxia              Brain, Duke glioblastoma multiforme cell line,
                                    H247, grown under 1.5% oxygen
SAGE_Duke_post_crisis_fibroblasts   Skin, post-crisis survival fibroblast cell-line
SAGE_Duke_precrisis_fibroblasts     Skin, large T antigen transformed human
                                    fibroblasts clones
SAGEA                               Prostate, cancer cell line. Induced with synthetic
                                    androgen
SAGE_IOSE29-11                      Ovary, surface epithelium line

Table 5.

Five known endothelial specific genes in the CGAP SAGE pools. TIE1 and
TIE2/TEK have multiple hits in the non-endothelial pool (most in normal or
carcinoma cell lines of ovarian origin). vWF is the most endothelial specific,
having 80 hits in the endothelial pool and only one hit in the non-endothelial
pool.

Known endothelial            Tags in the non-endothelial            Tags in the
specific gene                SAGE libraries                         endothelial SAGE
                                                                    libraries

von Willebrand factor        1 (colon carcinoma cell line)          80
(vWF)
flt1 VEGF receptor
KDR VEGF receptor            1 (IOSE29 ovarian surface              6
                             epithelium cell line)
TIE1 tyrosine kinase         17 (ovarian tumour and                 27
                             normal ovarian epithelium
                             cell lines)
TIE2/TEK tyrosine            4 (ovarian carcinoma and               2
kinase                       glioblastoma multiforme cell
                             lines)

Table 6.

Results of the UniGene/EST screen. Twenty known genes were selected in
the UniGene/EST screen (no hits in the non-endothelial pool and minimum one
hit in the endothelial pool). At least four of these genes are known endothelial
specific genes: TIE1, TIE2/TEK, LYVE1 and multimerin, indicating ~20%
prediction accuracy. Other genes, while certainly preferentially expressed in
the endothelial cells, may not be endothelial specific.

Description                                            UniGene ID    Endothelial
                                                                     hits

TIE1 receptor endothelial tyrosine kinase              Hs.78824      5
Cytosolic phospholipase A2; involved in the            Hs.211587     3
metabolism of eicosanoids
Branched chain alpha-ketoacid dehydrogenase            Hs.1265       2
cGMP-dependent protein kinase; cloned from             Hs.2689       2
aorta cDNA, strongly expressed in well
vascularised tissues like aorta, heart, and uterus
(Tamura et al, 1996)
Lymphatic vessel endothelial hyaluronan                Hs.17917      2
receptor 1 - LYVE1 (Banerji et al, 1999)
TRAF interacting protein: TNF signalling               Hs.21254      2
pathway
Multimerin: a very big endothelial specific            Hs.32934      2
protein; binds platelet factor V, can also be
found in platelets (Hayward et al, 1996)
Diubiquitin (a member of the ubiquitin family);        Hs.44532      2
reported in dendritic and B lymphocyte cells;
involved in antigen processing; this is first
evidence that it is also present in endothelial
cells (Bates et al, 1997)
Beta-transducin family protein; also a homolog         Hs.85570
of D. melanogaster gene notchless: a novel
WD40 repeat containing protein that modulates
Notch signalling activity
TIE2/TEK receptor endothelial tyrosine kinase          Hs.89640
BCL2 associated X protein (BAX)                        Hs.159428
Sepiapterin reductase mRNA                             Hs.160100
Retinoic acid receptor beta (RARB)                     Hs.171495
ST2 receptor: a homolog of the interleukin 1           Hs.66
receptor
Mitogen activated protein kinase 8 (MAPK8)             Hs.859
ERG gene related to the ETS oncogene                   Hs.45514
PP35 similar to E. coli yhdg and R. capsulatus         Hs.97627
nifR3
Interphotoreceptor matrix proteoglycan;                Hs.129882
strongly expressed in retina and umbilical cord
vein (Felbor et al, 1998)
Methylmalonate semialdehyde dehydrogenase              Hs.170008
gene
HTLV-I related endogenous retroviral sequence          Hs.247963



2 
2 
2 
2 
1 

1 
1 
1 

Table 7.

xProfiler differential analysis was combined with data from the
UniGene/EST screen, achieving 100% certainty of prediction. xProfiler's
output lists genes with a 10-times higher number of tags in the endothelial than
in the non-endothelial pool of SAGE-CGAP libraries. Hits corresponding to
these genes in the endothelial and non-endothelial EST pools were identified
by identity-level BLAST searches for mRNA (known genes) or phrap
computed contig sequences (EST clusters representing novel genes). Genes are
sorted according to the number of hits in the non-endothelial EST pool.
Known and predicted novel endothelial specific genes are in bold.

UniGene ID   Gene description            xProfiler     Hits in        Hits in non-
                                         prediction    endothelial    endothelial
                                         certainty     EST pool       EST pool

Hs.13957     ESTs-ECSM1                  97%           4              0
Hs.111518    magic roundabout,           100%          4              0
             distant homology to
             human roundabout 1
Hs.268107    multimerin                  92%           5              0
Hs.155106    calcitonin receptor-        97%           0              0
             like receptor activity
             modifying protein 2
Hs.233955    ESTs                        96%           0              0
Hs.26530     serum deprivation           94%           3              1
             response
             (phosphatidylserine-
             binding protein)
Hs.83213     fatty acid binding          100%          3
             protein 4
Hs.110802    von Willebrand              100%          25
             factor
Hs.76206     cadherin 5, VE-             100%          4
             cadherin (vascular
             endothelium)
Hs.2271      endothelin 1                98%           9
Hs.119129    collagen, type IV,          100%          4
             alpha 1
Hs.78146     platelet/endothelial        99%           18
             cell adhesion
             molecule (CD31
             antigen)
Hs.76224     EGF-containing              100%          37
             fibulin-like
             extracellular matrix
             protein 1
Hs.75511     connective tissue           100%          34
             growth factor



2 

6 



48 

Table 9.

List of primers used in RT-PCR reactions. dbSTS primers were used if a
UniGene entry contained a sequence tagged site (STS). Otherwise, primers
were designed using the Primer3 programme.

Gene                                   Primers (sequence or GenBank Accession
                                       for the STS)

ECSM1 - Hs.13957                       G26129
Magic roundabout - Hs.111518           G14937
calcitonin receptor-like receptor      G26129
activity modifying 2
Hs.233955                              G21261
fatty acid binding protein 4           5'-TGC AGC TTC CTT CTC ACC TT-3'
                                       5'-TCA CAT CCC CAT TCA CAC TG-3'
von Willebrand factor                  5'-TGT ACC ATG AGG TTC TCA ATG C-3'
                                       5'-TTA TTG TGG GCT CAG AAG GG-3'
serum deprivation response protein     G21528
collagen, type IV, alpha 1             G07125
EGF-containing fibulin-like            G06992
extracellular matrix protein 1
connective tissue growth factor        5'-CAA ATG CTT CCA GGT GAA AAA-3'
                                       5'-CGT TCA AAG CAT GAA ATG OA-3'


Table 10.

ESTs belonging to ECSM1 contig sequence are as follows:
EST SEQUENCES (30)

AI540508, cDNAcloneIMAGE:2209821, Uterus, 3'read, 2.1kb
AI870175, cDNAcloneIMAGE:2424998, Uterus, 3'read, 1.7kb
AI978643, cDNAcloneIMAGE:2491824, Uterus, 3'read, 1.3kb
AI473856, cDNAcloneIMAGE:2044374, Lymph, 3'read
AI037900, cDNAcloneIMAGE:1657707, Wholeembryo, 3'read, 1.2kb
AI417620, cDNAcloneIMAGE:2115082, 3'read, 1.0kb
AA147817, cDNAcloneIMAGE:590062, 3'read
AA968592, cDNAcloneIMAGE:1578323, 3'read, 0.7kb
AW474729, cDNAcloneIMAGE:2853635, Uterus, 3'read
R02352, cDNAcloneIMAGE:124282, 3'read, 0.7kb
R01889, cDNAcloneIMAGE:124485, 5'read, 0.7kb
AA446606, cDNAcloneIMAGE:783693, Wholeembryo, 3'read
R02456, cDNAcloneIMAGE:124282, 5'read, 0.7kb
T72705, cDNAcloneIMAGE:108686, 5'read, 0.7kb
R01890, cDNAcloneIMAGE:124485, 3'read, 0.7kb
AA147925, cDNAcloneIMAGE:590014, 5'read
AI131471, cDNAcloneIMAGE:1709098, Heart, 3'read, 0.6kb
AA733177, cDNAclone399421, Heart, 3'read
AI039489, cDNAcloneIMAGE:1658903, Wholeembryo, 3'read, 0.6kb
AI128585, cDNAcloneIMAGE:1691245, Heart, 3'read, 0.6kb
AI540506, cDNAcloneIMAGE:2209817, Uterus, 3'read, 0.6kb
AA894832, cDNAcloneIMAGE:1502815, Kidney, 3'read, 0.5kb
AW057578, cDNAcloneIMAGE:2553014, Pooled, 3'read, 0.3kb
AA729975, cDNAcloneIMAGE:1257976, GermCell, 0.3kb
AI131016, cDNAcloneIMAGE:1706622, Heart, 3'read, 0.2kb
AA147965, cDNAcloneIMAGE:590062, 5'read
AA446735, cDNAcloneIMAGE:783693, Wholeembryo, 5'read
AA147867, cDNAcloneIMAGE:590014, 3'read
AI497866, cDNAcloneIMAGE:2125892, Pooled, 3'read
T72636, cDNAcloneIMAGE:108686, 3'read, 0.7kb



Table 11.

ESTs within the magic roundabout sequence:

EST sequences in magic roundabout (55):

AI803963, cDNAcloneIMAGE:2069520, 3'read, 0.9kb
W88669, cDNAcloneIMAGE:417844, 3'read, 0.7kb
AI184863, cDNAcloneIMAGE:1565500, Pooled, 3'read, 0.6kb
AA011319, cDNAcloneIMAGE:359779, Heart, 3'read, 0.6kb
AA302765, cDNAcloneATCC:194652, Adipose, 3'read
AI278949, cDNAcloneIMAGE:1912098, Colon, 3'read, 0.7kb
AI265775, cDNAcloneIMAGE:2006542, Ovary, 3'read
AA746200, cDNAcloneIMAGE:1324396, Kidney, 0.5kb
N78762, cDNAcloneIMAGE:301290, Lung, 3'read
AI352263, cDNAcloneIMAGE:1940638, Wholeembryo, 3'read, 0.6kb
AA630260, cDNAcloneIMAGE:854855, Lung, 3'read, 0.5kb
C20950, cDNAclone(no-name), 3'read
W88875, cDNAcloneIMAGE:417844, 5'read, 0.7kb
AA156022, cDNAcloneIMAGE:590120, 3'read
N93972, cDNAcloneIMAGE:309369, Lung, 3'read, 1.7kb
AI217602, cDNAcloneIMAGE:1732380, Heart, 3'read, 0.5kb
AW294276, cDNAcloneIMAGE:2726347, 3'read
AA010931, cDNAcloneIMAGE:359779, Heart, 5'read, 0.6kb
AA303624, cDNAcloneATCC:115215, Aorta, 5'read
AI366745, cDNAcloneIMAGE:1935056, 3'read, 0.5kb
AA327257, cDNAcloneATCC:127927, Colon, 5'read
C06489, cDNAclonehbc5849, Pancreas
BE218677, cDNAcloneIMAGE:3176164, lung, 3'read
AA335675, cDNAcloneATCC:137498, Testis, 5'read
R84975, cDNAcloneIMAGE:180552, Brain, 3'read, 2.1kb
AI926445, cDNAcloneIMAGE:2459442, Stomach, 3'read, 1.9kb
H61208, cDNAcloneIMAGE:236318, Ovary, 3'read, 1.9kb
AA335358, cDNAcloneATCC:137019, Testis, 5'read
AI129190, cDNAcloneIMAGE:1509564, Pooled, 3'read, 0.8kb
T59188, cDNAcloneIMAGE:74634, Spleen, 5'read, 0.8kb
T59150, cDNAcloneIMAGE:74634, Spleen, 3'read, 0.8kb
R53174, cDNAcloneIMAGE:154350, Breast, 5'read, 0.8kb
AA156150, cDNAcloneIMAGE:590120, 5'read
AA302509, cDNAcloneATCC:114727, Aorta, 5'read
R99429, cDNAcloneIMAGE:201985, 5'read, 2.4kb
AI813787, cDNAcloneIMAGE:2421627, Pancreas, 3'read, 1.2kb
H62113, cDNAcloneIMAGE:236316, Ovary, 5'read, 1.0kb
R16422, cDNAcloneIMAGE:129313, 5'read, 0.7kb
T48993, cDNAcloneIMAGE:70531, Placenta, 5'read, 0.6kb
T05694, cDNAcloneHFBDF13, Brain
R84531, cDNAcloneIMAGE:180104, Brain, 5'read, 2.2kb
AI903080, cDNAclone(no-name), breast
AI903083, cDNAclone(no-name), breast
AA302764, cDNAcloneATCC:194652, Adipose, 5'read
AA341407, cDNAcloneATCC:143064, Kidney, 5'read
W16503, cDNAcloneIMAGE:301194, Lung, 5'read
AW801246, cDNAclone(no-name), uterus
AW959183, cDNAclone(no-name)
R85924, cDNAcloneIMAGE:180104, Brain, 3'read, 2.2kb
AA358843, cDNAcloneATCC:162953, Lung, 5'read



WO 02/36771 




PCT/GB01/04906 



168 



BE161769, cDNAclone(no-name), head-neck
W40341, cDNAcloneIMAGE:309369, Lung, 5'read, 1.7kb
AA876225, cDNAcloneIMAGE:1257188, GermCell, 3'read
R99441, cDNAcloneIMAGE:202009, 5'read, 2.3kb
W76132, cDNAcloneIMAGE:344982, Heart, 5'read, 1.4kb

Table 12.

110 ESTs in the mouse magic roundabout cluster (Mm.27782)

AI427548, cDNAcloneIMAGE:521115, Muscle, 3'read
AV022394, cDNAclone1190026N09, 3'read
BB219221, cDNAcloneA530053H04, 3'read
AI604803, cDNAcloneIMAGE:388336, Embryo, 3'read
AI504730, cDNAcloneIMAGE:964027, Mammarygland, 3'read
AI430395, cDNAcloneIMAGE:388336, Embryo, 5'read
AI181963, cDNAcloneIMAGE:1451626, Liver, 3'read
AV020471, cDNAclone1190017N14, 3'read
BB219225, cDNAcloneA530053H12, 3'read
BB224304, cDNAcloneA530086A21, 3'read
BB527740, cDNAcloneD930042M18, 3'read
W66614, cDNAcloneIMAGE:388336, Embryo, 5'read
BB097630, cDNAclone9430060E21, 3'read
AI152731, cDNAcloneIMAGE:1478154, Uterus, 5'read
AW742708, cDNAcloneIMAGE:2780289, innerear,170pooled, 3'read
BB118169, cDNAclone9530064M17, 3'read
AI839154, cDNAcloneUI-M-AO0-ach-e-11-0-UI, 3'read
BB206388, cDNAcloneA430075J10, 3'read
BB381670, cDNAcloneC230015E01, 3'read
BB199721, cDNAcloneA430017A19, 3'read
AI593217, cDNAcloneIMAGE:1177959, Mammarygland, 3'read
BB219411, cDNAcloneA530054L01, 3'read
BB220744, cDNAcloneA530061M19, 3'read
BB220944, cDNAcloneA530062O22, 3'read
BB390078, cDNAcloneC230066L23, 3'read
BB220730, cDNAcloneA530061L13, 3'read
AI615527, cDNAcloneIMAGE:964027, Mammarygland, 5'read
AI882477, cDNAcloneIMAGE:1396822, Mammarygland, 5'read
AV025281, cDNAclone1200012D01, 3'read
BB470462, cDNAcloneD230033L23, 3'read
BB247620, cDNAcloneA730020G03, 3'read
BB555377, cDNAcloneE330019B13, 3'read
BB512960, cDNAcloneD730043I21
BB400157, cDNAcloneC330017F17, 3'read
BB320465, cDNAcloneB230385O10, 3'read
BB105670, cDNAclone9430096H10, 3'read
BB441462, cDNAcloneD030027B11, 3'read
BB137530, cDNAclone9830142O07, 3'read
AA553155, cDNAcloneIMAGE:964027, Mammarygland, 5'read
BB319763, cDNAcloneB230382G07, 3'read
BB451051, cDNAcloneD130007I05, 3'read
BB504672, cDNAcloneD630049J11, 3'read
AI429453, cDNAcloneIMAGE:569122, Embryo, 3'read
BB190585, cDNAcloneA330062J23, 3'read
BB257082, cDNAcloneA730076M18, 3'read
BB386699, cDNAcloneC230047P06, 3'read
BB295814, cDNAcloneB130042A09, 3'read
BB450972, cDNAcloneD130007A22, 3'read
AA718562, cDNAcloneIMAGE:1177959, Mammarygland, 5'read
BB223775, cDNAcloneA530083K18, 3'read
AV020555, cDNAclone1190018G05, 3'read
BB226083, cDNAcloneA530095K11, 3'read
BB482105, cDNAcloneD430007O19, 3'read
BB381671, cDNAcloneC230015E02, 3'read
BB383758, cDNAcloneC230030C02, 3'read
BB257519, cDNAcloneA730080D13, 3'read
BB265667, cDNAcloneA830021I17, 3'read
BB254777, cDNAcloneA730063K20, 3'read
AV240775, cDNAclone4732443F15, 3'read
BB315010, cDNAcloneB230352H04, 3'read
BB390074, cDNAcloneC230066L16, 3'read
BB517605, cDNAcloneD830025B17, 3'read
BB484410, cDNAcloneD430025H01, 3'read
BB357583, cDNAcloneC030022J01, 3'read
AV225639, cDNAclone3830431D12, 3'read
BB554921, cDNAcloneE330016A12, 3'read
BB161650, cDNAcloneA130061H21, 3'read
BB106720, cDNAclone9530002M22, 3'read
BB535465, cDNAcloneE030043P14, 3'read
BB357738, cDNAcloneC030024B10, 3'read
AV285588, cDNAclone5031411M12
BB188339, cDNAcloneA330048H22, 3'read
AV337749, cDNAclone6430404F19, 3'read
BB065281, cDNAclone8030443H10, 3'read
BB148059, cDNAclone9930104N19, 3'read
AV252251, cDNAclone4833438P20, 3'read
BB184506, cDNAcloneA330012J24, 3'read
BB522445, cDNAcloneD930007M08, 3'read
BB520366, cDNAcloneD830041K23, 3'read
AV127290, cDNAclone2700047J01, 3'read
BB248651, cDNAcloneA730027F04, 3'read
BB008452, cDNAclone4732482M24, 3'read
BB550719, cDNAcloneE230024C07, 3'read
BB182033, cDNAcloneA230095N14, 3'read
BB480258, cDNAcloneD330045D17, 3'read
BB004855, cDNAclone4732463E03, 3'read
AV379748, cDNAclone9230013A19, 3'read
BB552137, cDNAcloneE230035B12, 3'read
BB288263, cDNAcloneIMAGE:3490042, mammary, 5'read
BB215681, cDNAcloneA530026M11, 3'read
BB251356, cDNAcloneA730046B16, 3'read
BB503441, cDNAcloneD630043F10, 3'read
BB500571, cDNAcloneD630029E03, 3'read
BB199833, cDNAcloneA430017K13, 3'read
BB533549, cDNAcloneE030030K03, 3'read
BB098399, cDNAclone9430063L18, 3'read
BB213310, cDNAcloneA530009E09, 3'read
BB240699, cDNAcloneA630083B14, 3'read
BB217106, cDNAcloneA530040N24, 3'read
BB057432, cDNAclone7120459H22, 3'read
BB214645, cDNAcloneA530021N22, 3'read
BB218254, cDNAcloneA530048K12, 3'read
BB319841, cDNAcloneB230382O06, 3'read
BB459759, cDNAcloneD130063G22, 3'read
BB485618, cDNAcloneD430032M09, 3'read
BB517699, cDNAcloneD830025J18, 3'read
BB535595, cDNAcloneE030044M09, 3'read
BB536291, cDNAcloneE030049D17, 3'read
BB552689, cDNAcloneE330001A16, 3'read
BB552709, cDNAcloneE330001C16, 3'read

Example 2.

ECSM4 expression is restricted to endothelial cells.

In situ hybridisation (ISH) of tumour and normal tissues showed that the
expression of ECSM4 is restricted to vascular endothelial cells in adult
angiogenic vessels only. Analysis of normal tissues showed that expression of
ECSM4 is detected in human placenta and umbilical cord foetal tissue at 10.8
weeks menstrual age. As shown in Figure 16, ECSM4 expression is highly
specific for the vascular endothelial cells of the blood vessel in placenta.
Furthermore, expression was absent throughout a number of other normal
tissues that were analysed, including adult liver, brain cerebrum and large
vessels, prostate, colon, small bowel, heart, eye (choroid and sclera), ovary,
stomach and breast; foetal bladder, testis and kidney (15.8 weeks); foetal
heart, kidney, adrenal and intestine (11.3 weeks); foetal brain (10.6 weeks);
and foetal eye (16.5 weeks) (data not shown).

ISH analysis of colorectal liver metastasis biopsies showed that expression of
ECSM4 was restricted to vascular endothelial cells of the tumour vessels only
(Figures 17 and 18). No expression was detected in the surrounding normal
tissue. Furthermore, the enhanced expression in the vicinity of the necrotic
tissues (Figure 18, necrotic tissue is indicated by the bright signal labelled *)
is indicative of, and consistent with, induction of ECSM4 expression by
hypoxia. As such, ECSM4 may be a novel hypoxia regulated gene.

The highly restricted expression pattern of ECSM4 in angiogenic vessels in
normal and tumour tissues in the adult is entirely consistent with the
endothelial cell selective pattern of expression determined by the in silico
analysis described in Example 1.

Methods 

Blocks of formalin-fixed, paraffin-embedded tissues and tumours were
obtained from the archives of the Imperial Cancer Research Fund Breast
Pathology Group at Guy's Hospital, London, UK. An antisense riboprobe to
ECSM4 cDNA was prepared for specific localisation of the ECSM4 mRNA by
in situ hybridisation. The methods for pretreatment, hybridisation, washing,
and dipping of slides in Ilford K5 for autoradiography have been described
previously (Poulsom, R., Longcroft, J. M., Jeffrey, R. E., Rogers, L., and Steel,
J. H. (1998) Eur. J. Histochem. 42, 121-132). Films were exposed for 7 to 15
days before developing in Kodak D19 and counterstaining with Giemsa.
Sections were examined under conventional or reflected light dark-field
conditions (Olympus BH2 with epi-illumination) under a x5, x10 or x20
objective that allowed individual auto-radiographic silver grains to be seen as
bright objects on a dark background.

Example 3. 

ECSM4 polypeptide is detected only in endothelial cells.

Antibodies capable of selectively binding the ECSM4 polypeptide were
generated and used in immunohistochemistry to demonstrate the presence of
ECSM4 polypeptide in a range of cell types (Figures 21 to 26). Tissue samples
were prepared by standard techniques in the art of immunohistochemistry.

Generation of antibodies recognising ECSM4.

The peptides MR 165, MR 311 and MR 336 were fused to Keyhole Limpet
Haemocyanin (KLH) before immunisation of rabbits for production of
polyclonal antibodies. The antibody MGO-5 was derived from rabbits
immunised with the peptide MR 165, whereas MGO-7 was derived from
rabbits immunised with a mixture of MR 311 and MR 336. The sequences of
the peptides used to generate the polyclonal antibodies are shown below with
their positions within the amino acid sequence of full length human ECSM4 as
shown in Figure 12.

MR 165 = LSQSPGAVPQALVAWRA (681-697)
MR 274 = DSVLTPEEVALCLEL (790-804)
MR 311 = TYGYISVPTA (827-836)
MR 336 = KGGVLLCPPRPCLTPT (852-867)
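
By way of illustration only, the residue ranges quoted for the peptides can be checked against the peptide lengths. The Python sketch below is not part of the original work; the helper names (`span_length`, `check_peptides`, `locate`) are hypothetical, and the ranges are assumed to be inclusive and 1-based, as is conventional for protein coordinates.

```python
# Peptide immunogens and the residue ranges quoted for them in the text.
PEPTIDES = {
    "MR 165": ("LSQSPGAVPQALVAWRA", 681, 697),
    "MR 274": ("DSVLTPEEVALCLEL", 790, 804),
    "MR 311": ("TYGYISVPTA", 827, 836),
    "MR 336": ("KGGVLLCPPRPCLTPT", 852, 867),
}


def span_length(start, end):
    """Number of residues in an inclusive, 1-based residue range."""
    return end - start + 1


def check_peptides(peptides):
    """Map each peptide name to True if its length equals its stated range."""
    return {
        name: len(seq) == span_length(start, end)
        for name, (seq, start, end) in peptides.items()
    }


def locate(peptide, protein):
    """Return the inclusive 1-based (start, end) of peptide in protein, or None."""
    index = protein.find(peptide)
    if index < 0:
        return None
    return index + 1, index + len(peptide)


if __name__ == "__main__":
    for name, ok in check_peptides(PEPTIDES).items():
        print(name, "consistent" if ok else "inconsistent")
```

Given the full length sequence of Figure 12 as a string, `locate` could be used in the same way to confirm each quoted position directly.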

15 

Example 4. 

The magic roundabout EST sequence identified in the bioinformatics search for
endothelial specific transcripts was used to isolate a cDNA of 3800 base pairs
in length from a human heart cDNA library. A screen using gene specific
primers showed the gene to be present in libraries from heart, adult and foetal
brain, liver, lung, kidney, muscle, placenta and small intestine but absent from
peripheral blood leukocytes, spleen and testis. Highest expression was in the
placental library. Comparison of the magic roundabout sequence to that of
roundabout revealed a transmembrane protein with homology throughout but
absence of some extracellular domains. Thus, MR has two immunoglobulin
and two fibronectin domains in the extracellular domain compared to five
immunoglobulin and two fibronectin domains in the extracellular domains of
the neuronal specific roundabouts. A transmembrane domain was identified by
(i) using the transmembrane predicting software PRED-TMR and (ii) using an
alignment between human MR and human ROBO1 peptide sequences. Both
methods identified the same residues, amino acids 468-490, as the
transmembrane region of human MR. Thus, aa 1-467 are extracellular and
aa 491-1007 are intracellular. The intracellular domain contains a putative
proline rich region that is homologous to those in roundabout that are thought
to couple to c-abl (Bashaw et al (2000) Cell 101: 703-715).
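
The way the extracellular and intracellular ranges follow from the transmembrane assignment can be sketched as below. This is a minimal illustration, assuming a single-pass protein with an extracellular N-terminus as described above; the function names are hypothetical and this is not the PRED-TMR algorithm itself.

```python
def topology(protein_length, tm_start, tm_end):
    """Split a single-pass membrane protein into three inclusive, 1-based regions.

    Assumes the N-terminus is extracellular, as stated for human MR.
    """
    if not 1 < tm_start <= tm_end < protein_length:
        raise ValueError("transmembrane segment must lie inside the protein")
    return {
        "extracellular": (1, tm_start - 1),
        "transmembrane": (tm_start, tm_end),
        "intracellular": (tm_end + 1, protein_length),
    }


def region_of(residue, regions):
    """Name of the region that contains the given residue number."""
    for name, (start, end) in regions.items():
        if start <= residue <= end:
            return name
    raise ValueError("residue lies outside the protein")


# Values quoted in the text: 1007 residues, transmembrane region 468-490.
HUMAN_MR = topology(1007, 468, 490)

if __name__ == "__main__":
    print(HUMAN_MR)
    print(region_of(681, HUMAN_MR))
```

On this partition, residue 681 (the first residue of peptide MR 165) falls within the intracellular region.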

Human SHGC-11739 (GenBank acc. G14646) sequence tagged site (STS) was
mapped to magic roundabout mRNA in a BLAST dbSTS search. This
STS maps to chromosome 11 on the Stanford G3 physical map (region 5647.00
CR10000 LOD 1.09 bin 129). Nevertheless, much sequence is missing and the
genomic structure is not known. Search of the RIKEN database identified
murine magic roundabout. The predicted molecular weight for the peptide core
of human MR was 107,457 Da. This was confirmed by in vitro translation
(Figure 3).
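
As a rough plausibility check, the predicted mass can be compared with the chain length of 1007 residues. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue, which is an approximation and not a value taken from this document; on that basis the predicted figure is most sensibly read as 107,457 Da, that is about 107.5 kDa. The function names are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue in a polypeptide (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues):
    """Approximate polypeptide mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def mean_residue_mass_da(mass_da, n_residues):
    """Average mass per residue implied by a stated total mass in Da."""
    return mass_da / n_residues


if __name__ == "__main__":
    print(f"approximate mass: {approx_mass_kda(1007):.1f} kDa")
    print(f"implied mass per residue: {mean_residue_mass_da(107457, 1007):.1f} Da")
```

A precise value would require summing residue masses over the actual sequence of Figure 12, which this sketch does not attempt.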

Example 5.

ECSM4 expression is detectable in tumours

In situ hybridisation was used to characterise expression of ECSM4 in vivo.
Expression of ECSM4 was found to be very restricted (Table 13), with no
signal detectable in many tissues including neuronal tissue. In contrast, strong
expression was detected in placenta and a range of tumours including those of
the brain, bladder and colonic metastasis to the liver (Figure 27). Expression
within tumours was restricted to the tumour vasculature.
Immunohistochemical staining of placenta confirmed endothelial specific
expression of the protein.

A search of CGAP SAGE libraries for ECSM4 detected it only in endothelial
and tumour libraries (Table 14). This was consistent with in situ hybridisation
results in the adult showing that expression was restricted to tumour vessels
(colon metastasis to liver, ganglioglioma, bladder and breast carcinoma).

Table 13 Expression of magic roundabout in human tissue in vivo.

Expression detected

Placenta and umbilical cord foetal tissue (10.8 weeks menstrual age)
Vessels in colorectal liver metastasis, ganglioglioma, bladder and breast
carcinoma.

Expression not detected

Adult liver, brain cerebrum and large vessels, prostate, colon, small bowel,
heart, eye choroid and sclera, ovary, stomach, breast

Table 14 CGAP SAGE libraries in which magic roundabout was found
on the basis of gene to tag mapping

Library                             Tags/million Tags

HDMEC                               171
HDMEC + VEGF                        224
Medulloblastoma                     102
Glioblastoma multiforme             85
Ovary, serous adenocarcinoma        59
Glioblastoma multiforme, pooled     48

HDMEC, human dermal microvascular endothelial cells; VEGF, vascular
endothelial growth factor.
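
For orientation, the normalisation behind a tags-per-million figure and a simple ranking of the libraries can be sketched as follows. The values in `TABLE_14` are those listed in Table 14; the raw tag counts passed to `tags_per_million` in the usage line are hypothetical, since library sizes are not given here, and the function names are illustrative only.

```python
# Tags per million for magic roundabout, as listed in Table 14.
TABLE_14 = {
    "HDMEC": 171,
    "HDMEC + VEGF": 224,
    "Medulloblastoma": 102,
    "Glioblastoma multiforme": 85,
    "Ovary, serous adenocarcinoma": 59,
    "Glioblastoma multiforme, pooled": 48,
}


def tags_per_million(tag_count, total_tags):
    """Normalise a raw tag count to the size of its SAGE library."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return tag_count * 1_000_000 / total_tags


def ranked(table):
    """Libraries ordered from highest to lowest tags per million."""
    return sorted(table.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical raw counts: 5 matching tags in a library of 50,000 tags.
    print(tags_per_million(5, 50_000))
    for library, value in ranked(TABLE_14):
        print(f"{library:<34}{value}")
```

Normalising to library size in this way is what allows libraries of different sequencing depth to be compared on one scale.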

Example 6.

Induction of ECSM4 in hypoxic endothelial cells

Initial RT-PCR detected ECSM4 expression in endothelial cells but not in other
cell lines such as fibroblasts (normal endometrial and FEK4), colon carcinoma
(SW480 and HCT116), breast carcinoma (MDA453 and MDA468) and HeLa
cells. Ribonuclease protection analysis has confirmed and extended this
(Figure 11a). ECSM4 expression was seen to be restricted to endothelium
(three different isolates) and absent from fibroblast, carcinoma and neuronal
cells. Induction of ECSM4 in hypoxia in endothelial (but not non-endothelial)
cells was seen when expression of ECSM4 was analysed using two different
RNase protection probes. Expression was on average 5.5 and 2.6 fold higher in
hypoxia for HUVEC and HDMEC respectively. Western analysis identified a
weak band of 110 kD in human dermal microvascular endothelial cells
(HDMEC) that was absent from the non-endothelial cell types (Figure 11b). The
band was more intense when the HDMEC cells were exposed to 18 h hypoxia,
consistent with ECSM4 being a hypoxically regulated gene.
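
An average fold induction across two probes can be computed as sketched below. This is an illustration only: the signal values are hypothetical and are not the measured data, and the sketch assumes that each probe signal has already been normalised to a loading control.

```python
from statistics import mean


def fold_induction(hypoxic_signal, normoxic_signal):
    """Ratio of the hypoxic signal to the normoxic signal for one probe."""
    if normoxic_signal <= 0:
        raise ValueError("normoxic signal must be positive")
    return hypoxic_signal / normoxic_signal


def mean_fold_induction(probe_signals):
    """Average fold induction over (hypoxic, normoxic) pairs, one pair per probe."""
    return mean(fold_induction(h, n) for h, n in probe_signals)


if __name__ == "__main__":
    # Hypothetical normalised band intensities for two RNase protection probes.
    example_probes = [(12.0, 2.0), (10.0, 2.0)]
    print(mean_fold_induction(example_probes))
```

Averaging per-probe ratios, rather than pooling raw intensities, keeps a strongly labelled probe from dominating the result.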

CLAIMS

1. A compound comprising (i) a moiety which selectively binds the 
polypeptide ECSM4 and (ii) a further moiety. 

2. A compound comprising (i) a moiety which selectively binds the 
polypeptide ECSM1 and (ii) a further moiety. 

3. A compound according to either one of Claims 1 or 2 wherein the 
moiety which selectively binds is an antibody. 

4. A compound according to either one of Claims 1 or 2 wherein the 
moiety which selectively binds is a peptide. 

5. A compound according to any one of Claims 1 to 4 wherein the further 
moiety is a readily detectable moiety. 

6. A compound according to any one of Claims 1 to 4 wherein the further 
moiety is a directly or indirectly cytotoxic moiety. 

7. A compound according to Claim 6 wherein the cytotoxic moiety is a 
directly cytotoxic chemotherapeutic agent. 

8. A compound according to Claim 6 wherein the cytotoxic moiety is a 
directly cytotoxic polypeptide. 

9. A compound according to Claim 6 wherein the cytotoxic moiety is a
moiety which is able to convert a relatively non-toxic prodrug into a 
cytotoxic drug. 

10. A compound according to Claim 6 wherein the cytotoxic moiety is a 
radiosensitizer. 

11. A compound according to any one of Claims 1 to 4 wherein the further 
moiety comprises a nucleic acid molecule. 

12. A compound according to Claim 11 wherein the nucleic acid molecule 
is a cytotoxic nucleic acid. 

13. A compound according to Claim 12 wherein the nucleic acid molecule 
encodes a directly or indirectly cytotoxic polypeptide. 

14. A compound according to Claim 12 wherein the nucleic acid molecule 
is directly cytotoxic. 

15. A compound according to Claim 12 wherein the nucleic acid encodes a 
therapeutic polypeptide. 

16. A compound according to Claim 7 wherein the cytotoxic moiety 
comprises a radioactive atom. 

17. A compound according to Claim 16 wherein the radioactive atom is any
one of phosphorus-32, iodine-125, iodine-131, indium-111, rhenium-186,
rhenium-188 or yttrium-90.

18. A compound according to Claim 5 wherein the readily detectable 
moiety comprises a radioactive atom. 

19. A compound according to Claim 18 wherein the radioactive atom is
selected from any one of technetium-99m or iodine-123.

20. A compound according to Claim 5 wherein the readily detectable
moiety comprises a suitable amount of any one of iodine-123,
iodine-131, indium-111, fluorine-19, carbon-13, nitrogen-15, oxygen-17,
gadolinium, manganese or iron.

21. A compound according to either one of Claims 1 or 2 wherein the
further moiety is able to bind selectively to a directly or indirectly
cytotoxic moiety.

22. A compound according to either one of Claims 1 or 2 wherein the 
further moiety is able to bind selectively to a readily detectable moiety. 

23. A compound according to either one of Claims 1 or 2 wherein the 
selective binding moiety and the further moiety are polypeptides which 
are fused. 

24. A nucleic acid molecule encoding a compound according to Claim 23. 

25. A method of imaging vascular endothelium in the body of an individual
the method comprising administering to the individual an effective
amount of a compound according to Claim 5.

26. A method according to Claim 25 wherein the vasculature is
neovasculature.

27. A method of diagnosing or prognosing in an individual a condition
which involves the vascular endothelium the method comprising
administering to the individual an effective amount of a compound
according to Claim 5.

28. A method according to Claim 27 further comprising the step of
detecting the location of the compound in the individual.

29. A method according to any one of Claims 25 to 28 wherein the
individual has cancer.

30. A method of treating an individual in need of treatment, the method
comprising administering to the individual an effective amount of a
compound according to any one of Claims 1 to 4 or 6 to 17.

31. A method according to Claim 30 wherein the individual in need of
treatment has a proliferative disease or a condition involving the
vascular endothelium such as any one of cancer, psoriasis, diabetic
retinopathy, atherosclerosis or menorrhagia.

32. A method of introducing genetic material selectively into vascular
endothelial cells the method comprising contacting the cells with a 
compound according to any one of Claims 11-15. 

33. Use of a compound according to Claim 5 in the manufacture of an agent 
for imaging the vasculature in a body of an individual. 

34. Use of a compound according to Claim 5 in the manufacture of a 
diagnostic or prognostic agent for a condition which involves the 
vascular endothelium. 

35. Use of a compound according to any one of Claims 1 to 4 or 6 to 17 in 
the manufacture of a medicament for treating a condition involving the 
vascular endothelium. 

36. Use according to Claim 33 wherein the vasculature is cancer 
neovasculature or use according to either one of Claims 34 or 35 
wherein the condition is cancer. 

37. A polypeptide comprising or consisting of a fragment or variant or 
fusion of the ECSM4 polypeptide or a fusion of said fragment or variant 
provided that it is not a polypeptide consisting of the amino acid 
sequence given between residues 49 and 466 of Figure 4. 

38. A polypeptide according to Claim 37 comprising or consisting of the
sequence LSQSPGAVPQALVAWRA, DSVLTPEEVALCLEL, TYGYISVPTA,
KGGVLLCPPRPCLTPT, WLADTW, WLADTWRSTSGSKD, SPPTTYGYIS,
GSLANGWGSASEDNAASARASLVSSSDGSFLAD or FARALAVAVD or a fragment
thereof of at least 5 amino acids.

39. A polypeptide according to Claim 37 comprising or consisting of a
sequence present in the human ECSM4 but absent from the mouse
ECSM4 sequence.

40. A polypeptide according to Claim 39 wherein the sequence consists of or
comprises the sequence GGDSLLGGRGSL, LLQPPARGHAHDGQALSTDL,
EPQDYTEPVE, TAPGGQGAPWAEE or ERATQEPSEHGP.

41. A polypeptide comprising or consisting of the ECSM1 polypeptide or a
fragment or variant or fusion thereof or a fusion of said fragment or
variant.

42. A polynucleotide encoding a polypeptide according to any one of
Claims 37 to 40, or the complement thereof or a polynucleotide which
selectively hybridises to either of these which polynucleotide is not any
one of the clone or cDNA corresponding to GenBank Accession No
AK000805 or the ESTs whose GenBank Accession Nos are listed in
Table 11 or Table 12.

43. A polynucleotide according to Claim 42 which encodes a polypeptide
comprising an amino acid sequence with at least 65% identity to the
amino acid sequence given in Figure 4 or Figure 7.

44. A polynucleotide encoding a polypeptide according to Claim 41 or the
complement thereof or a polynucleotide which selectively hybridises to
either of these provided that the polynucleotide is not one present in
ATCC deposit No 209145 or the clone corresponding to GenBank
Accession No AC011526 or the ESTs whose GenBank Accession Nos
are listed in Table 10.

45. A polynucleotide according to Claim 45 which encodes a polypeptide
with at least 90% identity to the amino acid sequence given in Figure 2.

46. A polynucleotide according to any one of Claims 42 to 45 which is
detectably labelled.

47. An expression vector comprising a polynucleotide according to any one
of Claims 42 to 45.

48. A recombinant host cell comprising a polynucleotide according to any
one of Claims 23, 41 to 45 or 47.

49. A recombinant host cell according to Claim 48 which is a bacterial cell.

50. A recombinant host cell according to Claim 48 which is a mammalian
cell.

51. A method of producing a polypeptide according to any one of Claims 23
or 37 to 41 the method comprising expressing a polynucleotide 
according to any one of Claims 24, 42 to 45 or 47 or culturing a host 
cell according to any one of Claims 48 to 50. 

52. An antibody capable of selectively binding to the ECSM4 polypeptide 
or the ECSM1 polypeptide. 

53. An antibody according to Claim 52 which selectively binds a 
polypeptide comprising the amino acid sequence located between 
residues 1 and 467 of Figure 12. 

54. An antibody according to Claim 53 which selectively binds any one of 
the amino acid sequences GGDSLLGGRGSL, 
LLQPPARGHAHDGQALSTDL, EPQDYTEPVE, 
TAPGGQGAPWAEE or ERATQEPSEHGP. 

55. An antibody according to Claim 52 which selectively binds a 
polypeptide comprising the amino acid sequence located between 
residues 1 and 467 of Figure 12. 

56. An antibody according to any one of Claims 52 to 54 which is a 
monoclonal antibody. 



57. An antibody according to any one of Claims 52 to 55 which is
detectably labelled.

58. A method of detecting endothelial damage or activation in an individual
comprising obtaining a fluid sample from the individual and detecting
the presence of fragments of ECSM1 or ECSM4 in the sample.

59. A method of detecting a tumour or tumour neovasculature or cardiac
disease or endometriosis or atherosclerosis in an individual comprising
obtaining a fluid sample from the individual and detecting the presence
of fragments of ECSM1 or ECSM4 in the sample.

60. A method according to Claim 58 or 59 wherein the detection employs
an antibody according to any one of Claims 52 to 55 or a compound
according to any one of Claims 1 to 5, 18-20 or 22.

61. A method according to either one of Claims 58 or 60 wherein
endothelial damage is diagnostic of cancer, cardiac disease,
endometriosis or atherosclerosis in the individual.

62. A method according to any one of Claims 57 to 61 wherein the
individual is one receiving treatment for cancer, cardiac disease,
endometriosis or atherosclerosis and the amount of fragments of
ECSM1 or ECSM4 in the sample is determined and compared either to
that in a sample from an individual who does not have cancer, cardiac
disease, endometriosis or atherosclerosis and/or to the amount in a
sample from the individual prior to commencement of said treatment
and the comparison indicates the efficacy of treatment of the individual.

63. A method of modulating angiogenesis in an individual, the method
comprising administering to the individual ECSM4, a peptide or
fragment of ECSM4 or a ligand of ECSM4 or an antibody which
selectively binds to ECSM4 or ECSM1.

64. A method of diagnosing a condition which involves aberrant or
excessive growth of vascular endothelium in an individual comprising
obtaining a sample containing nucleic acid from the individual and
contacting said sample with a polynucleotide which selectively
hybridises to a nucleic acid which encodes the ECSM4 polypeptide or
the ECSM4 polypeptide.

65. A method of reducing the expression of the ECSM4 or ECSM1
polynucleotide in an individual, comprising administering to the
individual an agent which selectively prevents expression of ECSM4 or
ECSM1.

66. A method according to Claim 65 wherein the agent is an antisense
nucleic acid.

67. A method of screening for a molecule that binds to ECSM4 or a suitable
variant, fragment or fusion thereof, or a fusion of a said fragment or
fusion thereof, the method comprising 1) contacting a) the said
polypeptide with b) a test molecule, 2) detecting the presence of a
complex containing the ECSM4 polypeptide and a test molecule and
optionally 3) identifying any test molecule bound to the ECSM4
polypeptide.

68. A polynucleotide comprising a promoter and/or regulatory portion of
either of the ECSM1 or ECSM4 genes.

69. A polynucleotide according to Claim 68 which has transcriptional
promoter activity selective to endothelial cells.

70. A polynucleotide according to either one of Claims 68 or 69 which is
regulated by hypoxic conditions.

71. A polynucleotide as defined in any one of Claims 67 to 70 operatively
linked to a polynucleotide encoding a polypeptide.

72. A polynucleotide according to Claim 71 wherein the polypeptide is a
therapeutic polypeptide.

73. A polynucleotide according to Claim 72 wherein the polypeptide is a
therapeutic polypeptide suitable for treating a hypoxic condition such as
cancer, cardiac disease, endometriosis or atherosclerosis.

74. A polynucleotide according to any one of Claims 68 to 73 which is
suitable for use in gene therapy.

75. A kit of parts comprising a compound according to Claim 8 and a
relatively non-toxic prodrug.

76. A kit of parts comprising a compound according to Claim 20 and any
one of a directly or indirectly cytotoxic moiety or a readily detectable
moiety to which the said compound is able to bind via its further moiety.

77. A compound according to any one of Claims 1 to 23, or a polypeptide
according to Claims 37 to 41 or a polynucleotide according to any one
of Claims 42 to 46 or 68 to 74 or an expression vector according to
Claim 47 or an antibody according to Claims 52 to 56 for use in
medicine.

78. A pharmaceutical composition comprising a polypeptide according to
Claims 37 to 41 or a polynucleotide according to any one of Claims 42
to 46 or 68 to 74 or an expression vector according to Claim 47 or an
antibody according to Claims 52 to 56 and a pharmaceutically
acceptable carrier.

79. A method of treating an individual with cancer, cardiac disease, a
hypoxic condition, endometriosis or atherosclerosis comprising
administering to the individual a polynucleotide according to any one of
Claims 68 to 74.

80. A method of modulating angiogenesis in an individual comprising
administering to the individual a polynucleotide according to any one of
Claims 68 to 74 or a polynucleotide which is capable of expressing
ECSM4 or a fragment or variant thereof or which comprises an ECSM4
antisense nucleic acid.

81. A method according to Claim 80 when dependent on Claims 72 to 74
wherein the therapeutic polypeptide is any one or more of
immunomodulatory, anti-cancer, a blood-clotting-influencing factor or
an anti-proliferative or anti-inflammatory cytokine.

82. Use of a polynucleotide according to any one of Claims 68 to 74 in the
manufacture of a medicament for treating cancer, cardiac disease, a
hypoxic condition, endometriosis or atherosclerosis.

83. Use of a polynucleotide according to any one of Claims 68 to 74 or a
polynucleotide which is capable of expressing ECSM4 or a fragment or
variant thereof or which comprises an ECSM4 antisense nucleic acid in
the manufacture of a medicament for modulating angiogenesis.

84. A transgenic non-human mammal comprising a transgene which
encodes a polypeptide according to any one of Claims 37 to 41.

85. A non-human mammal wherein if it contains an ECSM1 gene or an
ECSM4 gene the gene or genes are missing or mutated.

86. A non-human mammal according to either one of Claims 84 or 85
which is a rodent, preferably mouse.

Figure 1

AACTGGTTGCGACACTCCGGTGTTGCACTCTGGCTGCTC
1 + + + + + + 60
NWLRHCGVALWLLLLGTAVC

ATCCACCGCCX3TCXSCCGAGCTAGGGTGCTTCT^
61 + + + + + + 120
IHRRRRARVLLGPGLYRYTS

GAGGATGCCATCCTAAAACACAGGATGGAT
121 + + + + + + 180
EDAILKHRMDHSDSQWLADT

TGGCGTTCC^CCTCTGGCTCTCGGGACCTC
181 + + + + + + 240
WRSTSGSRDLSSSSSLSSRL

GGGGCGGATGCCCGGGACCCACTAGACTGTCGTCGCTCCTTGCTCTCCTGGGACTCCCGA
241 + + + + + + 300
GADARDPLDCRRSLLSWDSR

AGCCCCGGCGTGCCCCTGCTTCC^GACACCAGCAC
301 + + + + + + 360
SPGVPLLPDTSTFYGSLIAE

CTGCCCTCCAGTACCCCAGCCAGGCCAAGTCCCCAGGTCCCAGCTGTCAGGCGCCTCCCA
361 + + + + + + 420
LPSSTPARPSPQVPAVRRLP

CCCCAGCTGGCCC^GCTCTCCAGCCCCTGCT^
421 + + + + + + 480
PQLAQLSSPCSSSDSLCSRR

GGACTCTCTTCTCCCCGCTTGTCTCTGGCCCCTGCAGAGGCTTGG
481 + + + + + + 540
GLSSPRLSLAPAEAWKAKKK

CAGGAGCTGCAGCATGCCAACAGTTCCCCACTC
541 + + + + + + 600
QELQHANSSPLLRGSHSLEL

CGGGCCTGTGAGTTAGGAAATAGAGGTTCCAAGAACCTTTC
601 + + + + + + 660
RACELGNRGSKNLSQSPGAV

CCCCAAGCTCTGGTTGCCTGGCGGGCCCT^
661 + + + + + + 720
PQALVAWRALGPKLLSSSNE

CTGGTTACTCGTCATCTCCCTCCA^
721 + + + + + + 780
LVTRHLPPAPLFPHETPPTQ

AGTCAACAGACCCAGCCTCCGGTGGCACCACAGGCTCCCTC
781 + + + + + + 840
SQQTQPPVAPQAPSSILLPA

GCCCCC^TCCCC^TCCTTAGCCCCTGCAGTCCCCCTAGCCCCC^GGCCTCTTCCCTCTCT
841 + + + + + + 900
APIPILSPCSPPSPQASSLS

GGCCCCAGCCCAGCTTCCAGTCGCCTGTCCAGCTCCTC^CTGTCATCCCTGGGGGAGGAT
901 + + + + + + 960
GPSPASSRLSSSSLSSLGED

CAAGACAGCGTGCTGACCCCTGAGGAGGTAGCCCTGTGCTTGGAACTCAGTGAGGGTGAG
961 + + + + + + 1020
QDSVLTPEEVALCLELSEGE

GAGACTCCCAGGAACAGCGTCTCTCCCATGC
1021 + + + + + + 1080
ETPRNSVSPMPRAPSPPTTY

GGGTAC^TCAGCGTCCCAACAGCCTC^
1081 + + + + + + 1140
GYISVPTASEFTDMGRTGGG

GTGGGGCCCAAGGGGGGAGTCTTGCTGTGCCCACCTCXXSCCCrGCCTCACCCCCACCCCC
1141 + + + + + + 1200
VGPKGGVLLCPPRPCLTPTP

AGCGAGGGCTCCTTAGCC^TGGTTGGGGCT
1201 + + + + + + 1260
SEGSLANGWGSASEDNAASA

AGAGCCAGCCTTGTCAGCTCCTCCGATGGCTCCTTCCTCGCTGATGCTCACTTTGCCCGG
1261 + + + + + + 1320
RASLVSSSDGSFLADAHFAR

GCCCTGGCAGTGGCTGTGGATAGCTTTGGTTTCGGTCTAGAGCCCAGGGAGGCAGACTGC
1321 + + + + + + 1380
ALAVAVDSFGFGLEPREADC

GTCTTCATAGGTATGTGAGGTCTCCCCATCTTACTCCTCACTCATGCCCCTTGC
1381 + + + + + + 1440
VFIGM*

AACAACTGTTATCATGTCATCATTGTTAAAAAAAAAAAAAAAAA^^^
1441 + + + + + 1496

Figure 5

ECSM4 Length: 2076

   1 AGGGGACTCT CTTCTCCCCG CTTGTCTCTG GCCCCTGCAG AGGCTTGGAA
  51 GGCCAAAAAG AAAGCAGGAG CTGCAGCATG CCAACAGTTC CCCACTGCTC
 101 CGGGGCAGCC ACTCCTTAGA GCTCCGGGCC TGTGAGTTAG GAAATAGAGG
 151 TTCCAAGAAC CTTTCCCAAA GCCCAGGAGC TGTGCCCCAA GCTCTGGTTG
 201 CCTGGCGGGC CCTGGGACCG AAACTCCTCA GCTCCTCAAA TGAGCTGGTT
 251 ACTCGTCATC TCCCTCCAGC ACCCCTCTTT CCTCATGAAA CTCCCCCAAC
 301 TCAGAGTCAA CAGACCCAGC CTCCGGTGGC ACCACAGGCT CCCTCCTCCA
 351 TCCTGCTGCC AGCAGCCCCC ATCCCCATCC TTAGCCCCTG CAGTCCCCCT
 401 AGCCCCCAGG CCTCTTCCCT CTCTGGCCCC AGCCCAGCTT CCAGTCGCCT
 451 GTCCAGCTCC TCACTGTCAT CCCTGGGGGA GGATCAAGAC AGCGTGCTGA
 501 CCCCTGAGGA GGTAGCCCTG TGCTTGGAAC TCAGTGAGGG TGAGGAGACT
 551 CCCAGGAACA GCGTCTCTCC CATGCCAAGG GTTCCTTCAC CCCCCACCAC
 601 CTATGGGTAC ATCAGCGTCC CAACAGCCTC AGAGTTCACG GACATGGGCA
 651 GGACTGGAGG AGGGGTGGGG CCCAAGGGGG GAGTCTTGCT GTGCCCACCT
 701 CGGCCCTGCC TCACCCCCAC CCCCAGCGAG GGCTCCTTAG CCAATGGTTG
 751 GGGCTCAGCC TCTGAGGACA ATGCCGCCAG CGCCAGAGCC AGCCTTGTCA
 801 GCTCCTCCGA TGGCTCCTTC CTCGCTGATG CTCACTTTGC CCGGGCCCTG
 851 GCAGTGGCTG TGGATAGCTT TGGTTTCGGT CTAGAGCCCA GGGAGGCAGA
 901 CTGCGTCTTC ATAGATGCCT CATCACCTCC CTCCCCACGG GATTGAGATC
 951 TTCCTGACCC CCAACCTCTC CCTGCCCCTG TGGGAAGTGG AGGCCAGACT
1001 GGTTGGAAGA CAATGGAAGG TCAGCCACAC CCAGCGGCTG GGAAGGGGGA
1051 TGCCTCCCTG GCCCCCTGAC TCTCAGATCT CTTCCCAGAG AAGTCAGCTC
1101 CACTGTCGTA TGCCCAAGGG TGGGTGCTTC TCCTGTAGAT TACTCCTGAA
1151 CCGTGTCCCT GAGACTTCCC AGACGGGAAT CAGAACCACT TCTCCTGTCC
1201 ACCCACAAGA CCTGGGCTGT GGTGTGTGGG TCTTGGCCTG TGTTTCTCTG
1251 CAGCTGGGGT CCACCTTCCC AAGCCTCCAG AGAGTTCTCC CTCCACGATT
1301 GTGAAAACAA ATGAAAACAA AATTAGAGCA AAGCTGTACC TGGGAGCCCT
1351 CAGGGAGCAA AACATCATCT CCACCTGACT CCTAGCCACT GCTTTCTCCT
1401 CTGTGCCATC CACTCCCACC ACCCAGGTTG TTTTTGGCCT GAAGGAGCAA
1451 GCCCTGCCTG CTGGCTTTTC CCCCCAACCA TTTGGGATTC ACAGGGAAGT
1501 GGGAGGGAGC CCAGAGGGTG GCCTTTTGTG GGAGGGACAG CAGTGGCTGC
1551 TGGGGGAGAG GGCTGTGGAG GAAGGAGCTT CTCGGAGCCC CCTCTCAGCC
1601 TTACCTGGGC CCCTCCTCTA GAGAAGAGCT CAACTCTCTC CCAACCCTCA
1651 CCAATGGAAA GAAAATAATT ATGAATGCCG ACTGAGGCAC TGAGGCCCCT
1701 ACCTCATGCC CAAAACAAAG GGGTTCAAGG CTGGGTCTAG CGAGGATGCT
1751 TGAAGGAAGG GAGGTATGGA GCCCGTAGGT CAAAAGCACC CATCCTCGTA
1801 CTGTTGTCAC TATGAGCTTA AGAAATTTGA TACCATAAAA TGGTAAAGAC
1851 TTGAGTTCTG TGAGATCATT CCCCGGAGCA CCATTTTTAG GGGAGCACCT
1901 GGAGAGATGG CAAGAATTTC CTGAGTTAGG CAGGGATCAG GCATTCATTG
1951 ACACTCAGGG AGTGTCACAC ATTTCTGTTC TGCAATTAAA GGGAGAATGA
2001 GGTTCATCCA CCAAATTTTA AGCAGAATAT AGGAAGGGCA GGGGTGGGGA
2051 GTTTCAGGGT CTGCTGGTCC TGGGCA

START: 2

STOP: 948

Translation:

Length: 315

   1 GDSLLPACLW PLQRLGRPKR KQELQHANSS PLLRGSHSLE LRACELGNRG
  51 SKNLSQSPGA VPQALVAWRA LGPKLLSSSN ELVTRHLPPA PLFPHETPPT
 101 QSQQTQPPVA PQAPSSILLP AAPIPILSPC SPPSPQASSL SGPSPASSRL
 151 SSSSLSSLGE DQDSVLTPEE VALCLELSEG EETPRNSVSP MPRVPSPPTT
 201 YGYISVPTAS EFTDMGRTGG GVGPKGGVLL CPPRPCLTPT PSEGSLANGW
 251 GSASEDNAAS ARASLVSSSD GSFLADAHFA RALAVAVDSF GFGLEPREAD
 301 CVFIDASSPP SPRD*

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Gap Weight: 50 Average Match: 10.000 

Length Weight: 3 Average Mismatch: 0.000 

Quality: 9397 Length: 2553 

Ratio: 6.281 Gaps: 1 

Percent Similarity: 92.738 Percent Identity: 92.738 

Match display thresholds for the alignment (s) : 
| = IDENTITY 
: = 5 
. = 1 

magic. seq x hs . 111518 . rev September 13, 2000 14:21 



     451 TCCAGCTCAGACAGCCTCTGCAGCCGCAGGGGACTCTCTTCTCCCCGCTT 500
                                    |||||||||||||||||||||||
       1                            AGGGGACTCTCTTCTCCCCGCTT 23

     501 GTCTCTGGCCCCTGCAGAGGCTTGGAAGGCCAAAAAG.AAGCAGGAGCTG 549
         ||||||||||||||||||||||||||||||||||||| ||||||||||||
      24 GTCTCTGGCCCCTGCAGAGGCTTGGAAGGCCAAAAAGAAAGCAGGAGCTG 73

     550 CAGCATGCCAACAGTTCCCCACTGCTCCGGGGCAGCCACTCCTTGGAGCT 599
         |||||||||||||||||||||||||||||||||||||||||||| |||||
      74 CAGCATGCCAACAGTTCCCCACTGCTCCGGGGCAGCCACTCCTTAGAGCT 123

     600 CCGGGCCTGTGAGTTAGGAAATAGAGGTTCCAAGAACCTTTCCCAAAGCC 649
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     124 CCGGGCCTGTGAGTTAGGAAATAGAGGTTCCAAGAACCTTTCCCAAAGCC 173

     650 CAGGGGCTGTGCCCCAAGCTCTGGTTGCCTGGCGGGCCCTGGGACCGAAA 699
         |||| |||||||||||||||||||||||||||||||||||||||||||||
     174 CAGGAGCTGTGCCCCAAGCTCTGGTTGCCTGGCGGGCCCTGGGACCGAAA 223

     700 CTCCTCAGCTCCTCAAATGAGCTGGTTACTCGTCATCTCCCTCCAGCACC 749
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     224 CTCCTCAGCTCCTCAAATGAGCTGGTTACTCGTCATCTCCCTCCAGCACC 273

     750 CCTCTTTCCTCATGAAACTCCCCCAACTCAGAGTCAACAGACCCAGCCTC 799
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     274 CCTCTTTCCTCATGAAACTCCCCCAACTCAGAGTCAACAGACCCAGCCTC 323

     800 CGGTGGCACCACAGGCTCCCTCCTCCATCCTGCTGCCAGCAGCCCCCATC 849
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     324 CGGTGGCACCACAGGCTCCCTCCTCCATCCTGCTGCCAGCAGCCCCCATC 373

     850 CCCATCCTTAGCCCCTGCAGTCCCCCTAGCCCCCAGGCCTCTTCCCTCTC 899
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     374 CCCATCCTTAGCCCCTGCAGTCCCCCTAGCCCCCAGGCCTCTTCCCTCTC 423



Figure 6 (page 2 of 2) 



     900 TGGCCCCAGCCCAGCTTCCAGTCGCCTGTCCAGCTCCTCACTGTCATCCC 949
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     424 TGGCCCCAGCCCAGCTTCCAGTCGCCTGTCCAGCTCCTCACTGTCATCCC 473

     950 TGGGGGAGGATCAAGACAGCGTGCTGACCCCTGAGGAGGTAGCCCTGTGC 999
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     474 TGGGGGAGGATCAAGACAGCGTGCTGACCCCTGAGGAGGTAGCCCTGTGC 523

    1000 TTGGAACTCAGTGAGGGTGAGGAGACTCCCAGGAACAGCGTCTCTCCCAT 1049
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     524 TTGGAACTCAGTGAGGGTGAGGAGACTCCCAGGAACAGCGTCTCTCCCAT 573

    1050 GCCAAGGGCTCCTTCACCCCCCACCACCTATGGGTACATCAGCGTCCCAA 1099
         |||||||| |||||||||||||||||||||||||||||||||||||||||
     574 GCCAAGGGTTCCTTCACCCCCCACCACCTATGGGTACATCAGCGTCCCAA 623

    1100 CAGCCTCAGAGTTCACGGACATGGGCAGGACTGGAGGAGGGGTGGGGCCC 1149
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     624 CAGCCTCAGAGTTCACGGACATGGGCAGGACTGGAGGAGGGGTGGGGCCC 673

    1150 AAGGGGGGAGTCTTGCTGTGCCCACCTCGGCCCTGCCTCACCCCCACCCC 1199
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     674 AAGGGGGGAGTCTTGCTGTGCCCACCTCGGCCCTGCCTCACCCCCACCCC 723

    1200 CAGCGAGGGCTCCTTAGCCAATGGTTGGGGCTCAGCCTCTGAGGACAATG 1249
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     724 CAGCGAGGGCTCCTTAGCCAATGGTTGGGGCTCAGCCTCTGAGGACAATG 773

    1250 CCGCCAGCGCCAGAGCCAGCCTTGTCAGCTCCTCCGATGGCTCCTTCCTC 1299
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     774 CCGCCAGCGCCAGAGCCAGCCTTGTCAGCTCCTCCGATGGCTCCTTCCTC 823

    1300 GCTGATGCTCACTTTGCCCGGGCCCTGGCAGTGGCTGTGGATAGCTTTGG 1349
         ||||||||||||||||||||||||||||||||||||||||||||||||||
     824 GCTGATGCTCACTTTGCCCGGGCCCTGGCAGTGGCTGTGGATAGCTTTGG 873

    1350 TTTCGGTCTAGAGCCCAGGGAGGCAGACTGCGTCTTCATAGGTATGTGAG 1399
         ||||||||||||||||||||||||||||||||||||||||| |   | |
     874 TTTCGGTCTAGAGCCCAGGGAGGCAGACTGCGTCTTCATAGATGCCTCAT 923

    1400 GTCTCCCCATCTTACTCCTCACTCATGCCCCTTGCCTTTCTAACAACTGT 1449
           |  |||  |  ||         ||   |||  ||          |  |
     924 CACCTCCCTCCCCACGGGATTGAGATCTTCCTGACCCCCAACCTCTCCCT 973

    1450 TATCATGTCATCATTGTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAA... 1496
            | |||    | ||      | |        || | ||   ||
     974 GCCCCTGTGGGAAGTGGAGGCCAGACTGGTTGGAAGACAATGGAAGGTCA 1023
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Figure 10 

Gap Weight: 8 Average Match: 2.912 

Length Weight: 2 Average Mismatch: -2.003 

Quality: 597 Length: 135 

Ratio: 4.422 Gaps: 0 

Percent Similarity: 87.407 Percent Identity: 85.185 

Match display thresholds for the alignment (s) : 
| = IDENTITY 
: = 2 
. = 1 

mousemagic.pep x magic.pep September 12, 2000 18:05 



       1 EEVALCLELSDGEETPTNSVSPMPRAPSPPTTYGYISIPTCSGLADMGRA 50
         ||||||||||:||||| ||||||||||||||||||||:|| |   ||||
     280 EEVALCLELSEGEETPRNSVSPMPRAPSPPTTYGYISVPTASEFTDMGRT 329

      51 GGGVGSEVGNLLYPPRPCPTPTPSEGSLANGWGSASEDNVPSARASLVSS 100
         ||||| . | || ||||| ||||||||||||||||||||  |||||||||
     330 GGGVGPKGGVLLCPPRPCLTPTPSEGSLANGWGSASEDNAASARASLVSS 379

     101 SDGSFLADTHFARALAVAVDSFGLSLDPREADCVF 135
         |||||||| ||||||||||||||  |:||||||||
     380 SDGSFLADAHFARALAVAVDSFGFGLEPREADCVF 414






Figure 12 (page 1 of 4) 

GCGGCCGCGAATTCGGCACGAGCAGCAGGACAAAGTGCTCGGGACAAGGACATAGGGCTG 
1 + + + + + + 60 

AGAGTAGCCATGGGCTCTGGAGGAGACAGCCTCCTGGGGGGCAGGGGTTCCCTGCCTCTG 
61 + + + + + + 120 

MGSGGDSLLGGRGSLPL 
CTGCTCCTGCTCATCATGGGAGGCATGGCTCAGGACTCCCCGCCCCAGATCCTAGTCCAC 
121 + + + + + + 180 

LLLLIMGGMAQDSPPQILVH 
CCCCAGGACCAGCTGTTCCAGGGCCCTGGCCCTGCCAGGATGAGCTGCCAAGCCTCAGGC 

181 + + + + + + 240 

PQDQLFQGPGPARMSCQASG 

CAGCCACCTCCCACCATCCGCTGGTTGCTGAATGGGCAGCCCCTGAGCATGGTGCCCCCA 
241 + + + + + + 300 

QPPPTIRWLLNGQPLSMVPP 

GACCCACACCACCTCCTGCCTGATGGGACCCTTCTGCTGCTACAGCCCCCTGCCCGGGGA 
301 + + + + + + 360 

DPHHLLPDGTLLLLQPPARG 
CATGCCCACGATGGCCAGGCCCTGTCCACAGACCTGGGTGTCTACACATGTGAGGCCAGC 

361 + + + + + + 420 

HAHDGQALSTDLGVYTCEAS 
AACCGGCTTGGCACGGCAGTCAGCAGAGGCGCTCGGCTGTCTGTGGCTGTCCTCCGGGAG 

421 + + + + + + 480 

NRLGTAVSRGARLSVAVLRE 

GATTTCCAGATCCAGCCTCGGGACATGGTGGCTGTGGTGGGTGAGCAGTTTACTCTGGAA 
481 + + + + + + 540 

DFQIQPRDMVAVVGEQFTLE 
TGTGGGCCGCCCTGGGGCCACCCAGAGCCCACAGTCTCATGGTGGAAAGATGGGAAACCC 
541 + + + + + + 600 

CGPPWGHPEPTVSWWKDGKP 

CTGGCCCTCCAGCCCGGAAGGCACACAGTGTCCGGGGGGTCCCTGCTGATGGCAAGAGCA 
601 + + + + + + 660 

LALQPGRHTVSGGSLLMARA 
GAGAAGAGTGACGAAGGGACCTACATGTGTGTGGCCACCAACAGCGCAGGACATAGGGAG 

661 + + + + + + 720 

EKSDEGTYMCVATNSAGHRE 
AGCCGCGCAGCCCGGGTTTCCATCCAGGAGCCCCAGGACTACACGGAGCCTGTGGAGCTT 

721 + + + + + + 780 

SRAARVSIQEPQDYTEPVEL 
CTGGCTGTGCGAATTCAGCTGGAAAATGTGACACTGCTGAACCCGGATCCTGCAGAGGGC 

781 + + + + + + 840 

LAVRIQLENVTLLNPDPAEG 
CCCAAGCCTAGACCGGCGGTGTGGCTCAGCTGGAAGGTCAGTGGCCCTGCTGCGCCTGCC 

841 + + + + + + 900 

PKPRPAVWLSWKVSGPAAPA 
CAATCTTACACGGCCTTGTTCAGGACCCAGACTGCCCCGGGAGGCCAGGGAGCTCCGTGG 
901 + + + + + + 960 

QSYTALFRTQTAPGGQGAPW 
GCAGAGGAGCTGCTGGCCGGCTGGCAGAGCGCAGAGCTTGGAGGCCTCCACTGGGGCCAA 
961 + + + + + + 1020 

AEELLAGWQSAELGGLHWGQ 
GACTACGAGTTCAAAGTGAGACCATCCTCTGGCCGGGCTCGAGGCCCTGACAGCAACGTG 
1021 + + + + + + 1080 

DYEFKVRPSSGRARGPDSNV 
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Figure 12 (page 2 of 4) 

CTGCTCCTGAGGCTGCCGGAAAAAGTGCCCAGTGCCCCACCTCAGGAAGTGACTCTAAAG 

1081 + + + + + + 1140 

LLLRLPEKVPSAPPQEVTLK 
CCTGGCAATGGCACTGTCTTTGTGAGCTGGGTCCCACCACCTGCTGAAAACCACAATGGC 
1141 + + + + + + 1200 

PGNGTVFVSWVPPPAENHNG 
ATCATCCGTGGCTACCAGGTCTGGAGCCTGGGCAACACATCACTGCCACCAGCCAACTGG 
1201 + + + + + + 1260 

IIRGYQVWSLGNTSLPPANW 
ACTGTAGTTGGTGAGCAGACCCAGCTGGAAATCGCCACCCATATGCCAGGCTCCTACTGC 
1261 + + + + + + 1320 

TVVGEQTQLEIATHMPGSYC 
GTGCAAGTGGCTGCAGTCACTGGTGCTGGAGCTGGGGAGCCCAGTAGACCTGTCTGCCTC 
1321 + + + + + + 1380 

VQVAAVTGAGAGEPSRPVCL 
CTTTTAGAGCAGGCCATGGAGCGAGCCACCCAAGAACCCAGTGAGCATGGTCCCTGGACC 
1381 + + + + + + 1440 

LLEQAMERATQEPSEHGPWT 
CTGGAGCAGCTGAGGGCTACCTTGAAGCGGCCTGAGGTCATTGCCACCTGCGGTGTTGCA 

1441 + + + + + + 1500 

LEQLRATLKRPEVIATCGVA 
CTCTGGCTGCTGCTTCTGGGCACCGCCGTGTGTATCCACCGCCGGCGCCGAGCTAGGGTG 

1501 + + + + + + 1560 

LWLLLLGTAVCIHRRRRARV 
CACCTGGGCCCAGGTCTGTACAGATATACCAGTGAGGATGCCATCCTAAAACACAGGATG 

1561 + + + + + + 1620 

HLGPGLYRYTSEDAILKHRM 
GATCACAGTGACTCCCAGTGGTTGGCAGACACTTGGCGTTCCACCTCTGGCTCTCGGGAC 
1621 + + + + + + 1680 

DHSDSQWLADTWRSTSGSRD 
CTGAGCAGCAGCAGCAGCCTCAGCAGTCGGCTGGGGGCGGATGCCCGGGACCCACTAGAC 

1681 + + + + + + 1740 

LSSSSSLSSRLGADARDPLD 
TGTCGTCGCTCCTTGCTCTCCTGGGACTCCCGAAGCCCCGGCGTGCCCCTGCTTCCAGAC 

1741 + + + + + + 1800 

CRRSLLSWDSRSPGVPLLPD 

ACCAGCACTTTTTATGGCTCCCTCATCGCTGAGCTGCCCTCCAGTACCCCAGCCAGGCCA 
1801 + + + + + + 1860 

TSTFYGSLIAELPSSTPARP 
AGTCCCCAGGTCCCAGCTGTCAGGCGCCTCCCACCCCAGCTGGCCCAGCTCTCCAGCCCC 
1861 + + + + + + 1920 

SPQVPAVRRLPPQLAQLSSP 
TGTTCCAGCTCAGACAGCCTCTGCAGCCGCAGGGGACTCTCTTCTCCCCGCTTGTCTCTG 
1921 + + + + + + 1980 

CSSSDSLCSRRGLSSPRLSL 
GCCCCTGCAGAGGCTTGGAAGGCCAAAAAGAAGCAGGAGCTGCAGCATGCCAACAGTTCC 
1981 + + + + + + 2040 

APAEAWKAKKKQELQHANSS 
CCACTGCTCCGGGGCAGCCACTCCTTGGAGCTCCGGGCCTGTGAGTTAGGAAATAGAGGT 

2041 + + + + + + 2100 

PLLRGSHSLELRACELGNRG 
TCCAAGAACCTTTCCCAAAGCCCAGGAGCTGTGCCCCAAGCTCTGGTTGCCTGGCGGGCC 

2101 + + + + + + 2160 

SKNLSQSPGAVPQALVAWRA 
CTGGGACCGAAACTCCTCAGCTCCTCAAATGAGCTGGTTACTCGTCATCTCCCTCCAGCA 
2161 + + + + + + 2220 

LGPKLLSSSNELVTRHLPPA 
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Figure 12 (page 3 of 4) 



CCCCTCTTTCCTCATGAAACTCCCCCAACTCAGAGTCAACAGACCCAGCCTCCGGTGGCA 
2221 + + + + + + 2280 

PLFPHETPPTQSQQTQPPVA 
CCACAGGCTCCCTCCTCCATCCTGCTGCCAGCAGCCCCCATCCCCATCCTTAGCCCCTGC 

2281 + + + + + + 2340 

PQAPSSILLPAAPIPILSPC 
AGTCCCCCTAGCCCCCAGGCCTCTTCCCTCTCTGGCCCCAGCCCAGCTTCCAGTCGCCTG 

2341 + + + + + + 2400 

SPPSPQASSLSGPSPASSRL 
TCCAGCTCCTCACTGTCATCCCTGGGGGAGGATCAAGACAGCGTGCTGACCCCTGAGGAG 

2401 + + + + + + 2460 

SSSSLSSLGEDQDSVLTPEE 
GTAGCCCTGTGCTTGGAACTCAGTGAGGGTGAGGAGACTCCCAGGAACAGCGTCTCTCCC 

2461 + + + + + + 2520 

VALCLELSEGEETPRNSVSP 
ATGCCAAGGGCTCCTTCACCCCCCACCACCTATGGGTACATCAGCGTCCCAACAGCCTCA 

2521 + + + + + + 2580 

MPRAPSPPTTYGYISVPTAS 
GAGTTCACGGACATGGGCAGGACTGGAGGAGGGGTGGGGCCCAAGGGGGGAGTCTTGCTG 

2581 + + + + + + 2640 

EFTDMGRTGGGVGPKGGVLL 
TGCCCACCTCGGCCCTGCCTCACCCCCACCCCCAGCGAGGGCTCCTTAGCCAATGGTTGG 

2641 + + + + + + 2700 

CPPRPCLTPTPSEGSLANGW 
GGCTCAGCCTCTGAGGACAATGCCGCCAGCGCCAGAGCCAGCCTTGTCAGCTCCTCCGAT 

2701 + + + + + + 2760 

GSASEDNAASARASLVSSSD 
GGCTCCTTCCTCGCTGATGCTCACTTTGCCCGGGCCCTGGCAGTGGCTGTGGATAGCTTT 

2761 + + + + + + 2820 

GSFLADAHFARALAVAVDSF 
GGTTTCGGTCTAGAGCCCAGGGAGGCAGACTGCGTCTTCATAGATGCCTCATCACCTCCC 

2821 + + + + + + 2880 

GFGLEPREADCVFIDASSPP 
TCCCCACGGGATGAGATCTTCCTGACCCCCAACCTCTCCCTGCCCCTGTGGGAGTGGAGG 

2881 + + + + + + 2940 

SPRDEIFLTPNLSLPLWEWR 

CCAGACTGGTTGGAAGACATGGAGGTCAGCCACACCCAGCGGCTGGGAAGGGGGATGCCT 
2941 + + + + + + 3000 

PDWLEDMEVSHTQRLGRGMP 
CCCTGGCCCCCTGAACTCTCAGATCTCTTCCCAGAGAAGTCAGCTCCACTGTCGTATGCC 
3001 + + + + + + 3060 

PWPPELSDLFPEKSAPLSYA 
CAAGGCTGGTGCTTCTCCTGTAGATTACTCCTGAACCGTGTCCCTGAGACTTCCCAGACG 
3061 + + + + + + 3120 

QGWCFSCRLLLNRVPETSQT 
GGAATCAGAACCACTTCTCCTGTTCCACCCACAAGACCTGGGCTGTGGTGTGTGGGTCTT 
3121 + + + + + + 3180 

GIRTTSPVPPTRPGLWCVGL 
GGCCTGTGTTTCTCTGCAGCTGGGGTCCACCTTCCCAAGCCTCCAGAGAGTTCTCCCTCC 
3181 + + + + + + 3240 

GLCFSAAGVHLPKPPESSPS 
ACGATTGTGAAAACAAATGAAAACAAAATTAGAGCAAAGCTGACCTGGAGCCCTCAGGGA 
3241 + + + + + + 3300 

TIVKTNENKIRAKLTWSPQG 
GCAAAACATCATCTCCACCTGACTCCTAGCCACTGCTTTCTCCTCTGTGCCATCCACTCC 
3301 + + + + + + 3360 



AKHHLHLTPSHCFLLCAIHS 
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CACCACCAGGTTGTTTTGGCCTGAGGAGCAGCCCTGCCTGCTGCTCTTCCCCCACCATTT 

3361 + + + + + + 3420 

HHQVVLA* 

GGATCACAGGAAGTGGAGGAGCCAGAGGTGCCTTTGTGGAGGACAGCAGTGGCTGCTGGG 
3421 + + + + + + 3480 

AGAGGGCTGTGGAGGAAGGAGCTTCTCGGAGCCCCCTCTCAGCCTTACCTGGGCCCCTCC 
3481 + + + + + + 3540 

TCTAGAGAAGAGCTCAACTCTCTCCCAACCTCACCATGGAAAGAAAATAATTATGAATGC 
3541 + + + + + + 3600 

CACTGAGGCACTGAGGCCCTACCTCATGCCAAACAAAGGGTTCAAGGCTGGGTCTAGCGA 
3601 + + + + + + 3660 

GGATGCTGAAGGAAGGGAGGTATGAGACCCGTAGGTCAAAAGCACCATCCTCGTA 
3661 + + + + + 3715 
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(Linear) MAP of: /home/lif /icrt/mehtar/MuMR. seq check: 370 from: 1 
to: 3688 

REFORMAT of: MuMR.seq check: 370 from: 1 to: 3688 February 16, 
2001 14:25 
(No documentation) 

February 16, 2001 15:01 

agtgtatgggacaaggagaggagccgagagcagccatgggctctggaggaacgggcctcc 
1 + + + + + + 60 

tcacataccctgttcctctcctcggctctcgtcggtacccgagacctccttgcccggagg 

c MGQGEEPRAAMGSGGTGLL- 

tggggacggagtggcctctgcctctgctgctgcttttcatcatgggaggtgaggctctgg 
61 + + + + + + 120 

acccctgcctcaccggagacggagacgacgacgaaaagtagtaccctccactccgagacc 

C GTEWPLPLLLLFIMGGEALD- 

attctccaccccagatcctagttcacccccaggaccagctacttcagggctctggcccag 
121 + + + + + + 180 

taagaggtggggtctaggatcaagtgggggtcctggtcgatgaagtcccgagaccgggtc 

C SPPQILVHPQDQLLQGSGPA- 

ccaagatgaggtgcagatcatccggccaaccacctcccactatccgctggctgctgaatg 
181 + + + + + + 240 

ggttctactccacgtctagtaggccggttggtggagggtgataggcgaccgacgacttac 

C KMRCRSSGQPPPTIRWLLNG- 

ggcagcccctcagcatggccaccccagacctacattaccttttgccggatgggaccctcc 
241 + + + + + + 300 

ccgtcggggagtcgtaccggtggggtctggatgtaatggaaaacggcctaccctgggagg 
c QPLSMATPDLHYLLPDGTLL- 

tgttacatcggccctctgtccagggacggccacaagatgaccagaacatcctctcagcaa 
301 + + + + + + 360 

acaatgtagccgggagacaggtccctgccggtgttctactggtcttgtaggagagtcgtt 

C LHRPSVQGRPQDDQNILSAI- 

tcctgggtgtctacacatgtgaggccagcaaccggctgggcacagcagtgagccggggtg 
361 + + + + + + 420 

aggacccacagatgtgtacactccggtcgttggccgacccgtgtcgtcactcggccccac 
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C LGVYTCEASNRLGTAV S R G A - 

ctaggctgtctgtggctgtcctccaggaggacttccagatccaacctcgggacacagtgg 

421 + + + + + + 480 

gatccgacagacaccgacaggaggtcctcctgaaggtctaggttggagccctgtgtcacc 

C RLSVAVLQEDFQIQPRDTVA- 

ccgtggtgggagagagcttggttcttgagtgtggtcctccctggggctacccaaaaccct 
481 + + + + + + 540 

ggcaccaccctctctcgaaccaagaactcacaccaggagggaccccgatgggttttggga 

c VVGESLVLECGPPWGYPKPS- 

cggtctcatggtggaaagacgggaaacccctggtcctccagccagggaggcgcacagtat 

541 + + + + + + 600 

gccagagtaccacctttctgccctttggggaccaggaggtcggtccctccgcgtgtcata 

C VSWWKDGKPLVLQPGRRTVS- 

ctggggattccctgatggtgtcaagagcagagaagaatgactcggggacctatatgtgta 
601 + + + + + + 660 

gacccctaagggactaccacagttctcgtctcttcttactgagcccctggatatacacat 

C GDSLMVSRAEKNDSGTYMCM- 

tggccaccaacaatgctgggcaacgggagagccgagcagccagggtgtctatccaggaat 
661 + + + + + + 720 

accggtggttgttacgacccgttgccctctcggctcgtcggtcccacagataggtcctta 

C ATNNAGQRESRAARVSIQES- 

cccaggaccacaaggaacatctagagcttctggctgttcgcattcagctggaaaatgtga 
721 + + + + + + 780 

gggtcctggtgttccttgtagatctcgaagaccgacaagcgtaagtcgaccttttacact 

C QDHKEHLELLAVRIQLENVT- 

ccctgctaaaccccgaacctgtaaaaggtcccaagcctgggccatccgtgtggctcagct 
781 + + + + + + 840 

gggacgatttggggcttggacattttccagggttcggacccggtaggcacaccgagtcga 

C LLNPEPVKGPKPGPSVWLSW- 

ggaaggtgagcggccctgctgcacctgctgagtcatacacagctctgttcaggactcaga 
841 + + + + + + 900 

ccttccactcgccgggacgacgtggacgactcagtatgtgtcgagacaagtcctgagtct 

c KVSGPAAPAESYTALFRTQR- 

ggtcccccagggaccaaggatctccatggacagaggtgctgctgcgtggcttgcagagtg 
901 + + + + + + 960 

ccagggggtccctggttcctagaggtacctgtctccacgacgacgcaccgaacgtctcac 
C SPRDQGSPWTEVLLRGLQSA- 
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caaagcttgggggtctccactggggccaagactatgaattcaaagtgagaccgtcctccg 

961 + + + + + + 1020 

gtttcgaacccccagaggtgaccccggttctgatacttaagtttcactctggcaggaggc 

C KLGGLHWGQDYEFKVRPSSG- 

gccgggctcgaggccctgacagcaatgtgttgctcctgaggctgcctgaacaggtgccca 

1021 + + + + + + 1080 

cggcccgagctccgggactgtcgttacacaacgaggactccgacggacttgtccacgggt 

c RARGPDSNVLLLRLPEQVPS- 

gtgccccacctcaaggagtgaccttaagatctggcaacggtagtgtctttgtgagttggg 
1081 + + + + + + 1140 

cacggggtggagttcctcactggaattctagaccgttgccatcacagaaacactcaaccc 

c APPQGVTLRSGNGSVFVSWA- 

ctccaccacctgctgaaagccataatggtgtcatccgtggttaccaggtctggagcctgg 

1141 + + + + + + 1200 

gaggtggtggacgactttcggtattaccacagtaggcaccaatggtccagacctcggacc 

C PPPAESHNGVIRGYQVWSLG- 

gcaatgcctcattgcctgctgccaactggaccgtagtgggtgaacagacccagctggaga 

1201 + + + + + + 1260 

cgttacggagtaacggacgacggttgacctggcatcacccacttgtctgggtcgacctct 

C NASLPAANWTVVGEQTQLEI- 

tcgccacacgactgccaggctcctattgtgtgcaagtggctgcagtcactggagctggtg 

1261 + + + + + + 1320 

agcggtgtgctgacggtccgaggataacacacgttcaccgacgtcagtgacctcgaccac 

C ATRLPGSYCVQVAAVTGAGA- 

ctggagaactcagtacccctgtctgcctccttttagagcaggccatggagcaatcagcac 

1321 + + + + + + 1380 

gacctcttgagtcatggggacagacggaggaaaatctcgtccggtacctcgttagtcgtg 

C GELSTPVCLLLEQAMEQSAR- 

gagaccccaggaaacatgttccctggaccctggaacagctgagggccaccttgagacgac 
1381 + + + + + + 1440 

ctctggggtcctttgtacaagggacctgggaccttgtcgactcccggtggaactctgctg 
C DPRKHVPWTLEQLRATLRRP- 

cagaagtcattgccagtagtgctgtcctactctggttgctgctactaggcattactgtgt 
1441 + + + + + + 1500 

gtcttcagtaacggtcatcacgacaggatgagaccaacgacgatgatccgtaatgacaca 

c EVIASSAVLLWLLLLGITVC- 

gtatctacagacgacgcaaagctggggtgcacctgggcccaggtctgtacagatacacca 
1501 + + + + + + 1560 

catagatgtctgctgcgtttcgaccccacgtggacccgggtccagacatgtctatgtggt 
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C IYRRRKAGVHLGPGLYRYTS- 

gcgaggacgccattctaaaacacaggatggaccacagtgactccccatggctggcagaca 

1561 + + + + + + 1620 

cgctcctgcggtaagattttgtgtcctacctggtgtcactgaggggtaccgaccgtctgt 

C EDAILKHRMDHSDSPWLADT- 

cctggcgttccacctctggctctcgagacctgagcagcagcagcagccttagtagtcggc 

1621 + + + + + + 1680 

ggaccgcaaggtggagaccgagagctctggactcgtcgtcgtcgtcggaatcatcagccg 

c WRSTSGSRDLSSSSSLSSRL- 

tgggattggaccctcgggacccactagagggcaggcgctccttgatctcctgggaccctc 

1681 + + + + + + 1740 

accctaacctgggagccctgggtgatctcccgtccgcgaggaactagaggaccctgggag 

C GLDPRDPLEGRRSLISWDPR- 

ggagccccggtgtacccctgcttccagacacgagcacgttttacggctccctcattgcag 
1741 + + + + + + 1800 

cctcggggccacatggggacgaaggtctgtgctcgtgcaaaatgccgagggagtaacgtc 

c SPGVPLLPDTSTFYGSLIAE- 

agcagccttccagccctccagtccggccaagccccaagacaccagctgctaggcgctttc 

1801 + + + + + + 1860 

tcgtcggaaggtcgggaggtcaggccggttcggggttctgtggtcgacgatccgcgaaag 

C QPSSPPVRPSPKTPAARRFP- 

catccaagttggctggaacctccagcccctgggctagctcagatagtctctgcagccgca 
1861 + + + + + + 1920 

gtaggttcaaccgaccttggaggtcggggacccgatcgagtctatcagagacgtcggcgt 

C SKLAGTSSPWASSDSLCSRR- 

ggggactctgttccccacgcatgtctctgacccctacagaggcttggaaggccaaaaaga 
1921 + + + + + + 1980 

cccctgagacaaggggtgcgtacagagactggggatgtctccgaaccttccggtttttct 
c GLCSPRMSLTPTEAWKAKKK- 

agcaggaattgcaccaagctaacagctccccactgctccggggcagccaccccatggaaa 
1981 + + + + + + 2040 

tcgtccttaacgtggttcgattgtcgaggggtgacgaggccccgtcggtggggtaccttt 
c QELHQANSSPLLRGSHPMEI- 

tctgggcctgggagttgggaagcagagcctccaagaacctttctcaaagcccaggagaag 
2041 + + + + + + 2100 

agacccggaccctcaacccttcgtctcggaggttcttggaaagagtttcgggtcctcttc 
C WAWELGSRASKNLSQSPGEA- 
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cgccccgagccgtggtatcctggcgtgctgtgggaccacaacttcaccgcaactccagtg 

2101 + + + + + + 2160 

gcggggctcggcaccataggaccgcacgacaccctggtgttgaagtggcgttgaggtcac 

c PRAVVSWRAVGPQLHRNSSE- 

agctggcatctcgtccactccctccaacacccctttctcttcgtggagcttccagtcatg 

2161 + + + + + + 2220 

tcgaccgtagagcaggtgagggaggttgtggggaaagagaagcacctcgaaggtcagtac 

c LASRPLPPTPLSLRGASSHD- 

acccacagagccagtgtgtggagaagctccaagctccctcctctgacccactgccagcag 

2221 + + + + + + 2280 

tgggtgtctcggtcacacacctcttcgaggttcgagggaggagactgggtgacggtcgtc 

C PQSQCVEKLQAPSSDPLPAA- 

cccctctctccgtcctcaactcttccagaccttccagcccccaggcctctttcctctcct 
2281 + + + + + + 2340 

ggggagagaggcaggagttgagaaggtctggaaggtcgggggtccggagaaaggagagga 

C PLSVLNSSRPSSPQASFLSC- 

gtcctagcccatcctccagcaacctgtccagctcctcgctgtcatccttagaggaggagg 
2341 + + + + + + 2400 

caggatcgggtaggaggtcgttggacaggtcgaggagcgacagtaggaatctcctcctcc 

C PSPSSSNLSSSSLSSLEEEE- 

aggatcaggacagcgtgctcacccccgaggaggtagccctgtgtctggagctcagtgatg 
2401 + + + + + + 2460 

tcctagtcctgtcgcacgagtgggggctcctccatcgggacacagacctcgagtcactac 

C DQDSVLTPEEVALCLELSDG- 

gggaggagacacccacgaacagtgtatctcctatgccaagagctccttccccgccaacaa 
2461 + + + + + + 2520 

ccctcctctgtgggtgcttgtcacatagaggatacggttctcgaggaaggggcggttgtt 

C EETPTNSVSPMPRAPSPPTT- 

cctatggctatatcagcataccaacctgctcaggactggcagacatgggcagagctggcg 
2521 + + + + + + 2580 

ggataccgatatagtcgtatggttggacgagtcctgaccgtctgtacccgtctcgaccgc 
C YGYISIPTCSGLADMGRAGG- 

ggggcgtggggtctgaggttgggaacttactgtatccacctcggccctgccccaccccta 
2581 + + + + + + 2640 

ccccgcaccccagactccaacccttgaatgacataggtggagccgggacggggtggggat 
C GVGSEVGNLLYPPRPCPTPT- 

cacccagcgagggctccctggccaatggttggggctcagcttctgaggacaatgtcccca 
2641 + + + + + + 2700 



gtgggtcgctcccgagggaccggttaccaaccccgagtcgaagactcctgttacaggggt 
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c PSEGSLANGWGSASE DNVPS- 

gcgccagggccagcctggttagctcttctgatggctccttcctcgctgatactcactttg 
2701 + + + + + + 2760 

cgcggtcccggtcggaccaatcgagaagactaccgaggaaggagcgactatgagtgaaac 

C ARASLVSSSDGSFLADTHFA- 

ctcgtgccctggcagtggctgtggatagctttggcctcagtctggatcccagggaagctg 

2761 + + + + + + 2820 

gagcacgggaccgtcaccgacacctatcgaaaccggagtcagacctagggtcccttcgac 

C RALAVAVDSFGLSLDPREAD- 

actgtgtcttcactgatgcctcatcacctccctcccctcggggtgatctctccctgaccc 
2821 + + + + + + 2880 

tgacacagaagtgactacggagtagtggagggaggggagccccactagagagggactggg 

c CVFTDASSPPSPRGDLSLTR- 

gaagcttctctctgcctttgtgggagtggaggccagactggttggaagatgctgagatca 
2881 + + + + + + 2940 

cttcgaagagagacggaaacaccctcacctccggtctgaccaaccttctacgactctagt 
C SFSLPLWEWRPDWLEDAEIS- 

gccacacccagaggctggggagggggctgcctccctggcctcctgattctagggcctctt 
2941 + + + + + + 3000 

cggtgtgggtctccgacccctcccccgacggagggaccggaggactaagatcccggagaa 

c HTQRLGRGLPPWPPDSRASS- 

cccagcgaagttggctaactggtgctgtgcccaaggctggtgattcctcctgaattgtcc 
3001 + + + + + + 3060 

gggtcgcttcaaccgattgaccacgacacgggttccgaccactaaggaggacttaacagg 
c QRSWLTGAVPKAGDSS* 

ctgagaaggccagaagagcacccagaccactctcctgtctgtcccctggctttctcacat 
3061 + + + + + + 3120 

gactcttccggtcttctcgtgggtctggtgagaggacagacaggggaccgaaagagtgta 

c 

gtggaggtcttggcctatgcttctctgtaatagaagtccaccgtcactaggcttctggag 
3121 + + + + + + 3180 

cacctccagaaccggatacgaagagacattatcttcaggtggcagtgatccgaagacctc 

c 

agctctgtcattgggattgttaaaataaatgaaagcaaaccaaaatatgatcacgggagt 
3181 + + + + + + 3240 

tcgagacagtaaccctaacaattttatttactttcgtttggttttatactagtgccctca 
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cttggattcccactgagaacaagacagcatcttcaggacagcagactctccacaaccaga 

3241 + + + + + + 3300 

gaacctaagggtgactcttgttctgtcgtagaagtcctgtcgtctgagaggtgttggtct 

c 

acctttggcctaagtaagcctggctccggagctcccacctaagtggatcatggaaagaag 

3301 + + + + + + 3360 

tggaaaccggattcattcggaccgaggcctcgagggtggattcacctagtacctttcttc 

c 

ggaagccaaccaggtcttcaggaaggacagaaatgttttttggtgagggctatggtggag 

3361 + + + + + + 3420 

ccttcggttggtccagaagtccttcctgtctttacaaaaaaccactcccgataccacctc 

c MFFGEGYGGG- 

gacctgtggaagagccctctcatatctacttggactcctcccttagaggccagctcaacc 

3421 + + + + + + 3480 

ctggacaccttctcgggagagtatagatgaacctgaggagggaatctccggtcgagttgg 

C PVEEPSHIYLDSSLRGQLNP- 

ctttccccagtcacaccatgcaaggaaactaaaggagaaaggtcgtggatgcagtgggcc 

3481 + + + + + + 3540 

gaaaggggtcagtgtggtacgttcctttgatttcctctttccagcacctacgtcacccgg 

c FPSHTMQGN* 

ctatacagcgtcacagtcaatgcttcaaagtgagatcaatggaggagactgaaggaaagg 

3541 + + + + + + 3600 

gatatgtcgcagtgtcagttacgaagtttcactctagttacctcctctgacttcctttcc 

C MEETEGKD- 

acgcagggaaacagggaaccaatgcgctattctcattctaccgccactctgagcttaagg 

3601 + + + + + + 3660 

tgcgtccctttgtcccttggttacgcgataagagtaagatggcggtgagactcgaattcc 

c AGKQGTNALFSFYRHSELKE- 

aacttaattctataaaactgtaaagacg 

3661 + + 3688 

ttgaattaagatattttgacatttctgc 

c LNSIKL* 
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BESTFIT OF: MR. PEP CHECK: 5275 FROM: 1 TO: 1104 

TO: MUMR_1030818.PEP CHECK: 6771 FROM: 1 TO: 1228 

TRANSLATE OF: MUMR.SEQ CHECK: 370 FROM: 3 TO: 3688 
GENERATED SYMBOLS 1 TO: 1228. 

REFORMAT OF: MUMR.SEQ CHECK: 370 FROM: 1 TO: 3688 
SYMBOL COMPARISON TABLE: 

/MOLBIO0 /SOFTWARE/ GCG/ GCGCORE/DATA/RUNDATA/BLOSUM62 . CMP 
COMPCHECK: 6430 

GAP WEIGHT: 8 AVERAGE MATCH: 2.912 

LENGTH WEIGHT: 2 AVERAGE MISMATCH: -2.003 

QUALITY: 4035 LENGTH: 1081 

RATIO: 3.764 GAPS: 7 

PERCENT SIMILARITY: 77.392 PERCENT IDENTITY: 74.390 

MATCH DISPLAY THRESHOLDS FOR THE ALIGNMENT (S) : 
| = IDENTITY 
: = 2 
. = 1 

MR.PEP X MUMR.PEP 



1 MGSGGDSLLGGRGSLPLLLLLIMGGMAQDSPPQILVHPQDQLFQGPGPAR 50 
I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : 

12 MGSGGTGLLGTEWPLPLLLLFIMGGEALDSPPQILVHPQDQLLQGSGPAK 61 

• • • • • 

51 MSCQASGQPPPTIRWLLNGQPLSMVPPDPHHLLPDGTLLLLQPPARGHAH 100 

I I I II I I I I I I I I I I I I I I I I II I : I I I I I I I I I .1 .1 
62 MRCRSSGQPPPTIRWLLNGQPLSMATPDLHYLLPDGTLLLHRPSVQGRPQ 111 

• • • • • 

101 DGQ.ALSTDLGVYTCEASNRLGTAVSRGARLSVAVLREDFQIQPRDMVAV 149 

I I M I I I I I I I I I I I I | | I I I I I I I I I I | | | . | | | | | | | | | IN 
112 DDQNILSAILGVYTCEASNRLGTAVSRGARLSVAVLQEDFQIQPRDTVAV 161 

• • • • • 

150 VGEQFTLECGPPWGHPEPTVSWWKDGKPLALQPGRHTVSGGSLLMARAEK 199 

III I I I I I I I I : | . I . I | | | I I I I I I I I I I I I I I I I I :. . | I | | 
162 VGESLVLECGPPWGYPKPSVSWWKDGKPLVLQPGRRTVSGDSLMVSRAEK 211 

• • • • • 

200 SDEGTYMCVATNSAGHRESRAARVSIQEPQDYTEPVELLAVRIQLENVTL 249 

•I Mill. III. II llllllllllll II: I • I I I I I I I I I I I I I I 
212 NDSGTYMCMATNNAGQRESRAARVSIQESQDHKEHLELLAVRIQLENVTL 261 
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250 LNPDPAEGPKPRPAWLSWKVSGPAAPAQSYTALFRTQTAPGGQGAPWAE 299 

I I I: I .1111 I - I I I I I I I I I I I I I I : I I ) I I I I I I .1 I I. I I I 
262 LNPEPVKGPKPGPSVWLSWKVSGPAAPAESYTALFRTQRSPRDQGSPWTE 311 

• • • • • 

300 ELLAGWQSAELGGLHWGQDYEFKVRPSSGRARGPDSNVLLLRLPEKVPSA 349 

II I I I I • I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I . I I I I 
312 VLLRGLQSAKLGGLHWGQDYEFKVRPSSGRARGPDSNVLLLRLPEQVPSA 361 

• • • • • 

350 PPQEVTLKPGNGTVFVSWVPPPAENHNGIIRGYQVWSLGNTSLPPANWTV 399 

Ml III: I I I • I I I I I I I I I I • I I I : I I I I II I I I I I III I I I I I 
362 PPQGVTLRSGNGSVFVSWAPPPAESHNGVIRGYQVWSLGNASLPAANWTV 411 

• • • • • • 

400 VGEQTQLEIATHMPGSYCVQVAAVTGAGAGEPSRPVCLLLEQAMERATQE 449 

I I I I I I I I I I I : I H I I I I I I I I I I I I I I I I I I I I I I I I I | | .. .: 
412 VGEQTQLEIATRLPGSYCVQVAAVTGAGAGELSTPVCLLLEQAMEQSARD 461 

• • • • • 

450 PSEHGPWTLEQLRATLKRPEVIATCGVALWLLLLGTAVCIHRRRRARVHL 499 

I -I I Ml I I I I I I I : I I I I II - I I I I I I I I 111:111:1 III 
462 PRKHVPWTLEQLRATLRRPEVIASSAVLLWLLLLGITVCIYRRRKAGVHL 511 

• « • » » 

500 GPGLYRYTSEDAILKHRMDHSDSQWLADTWRSTSGSRDLSSSSSLSSRLG 549 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

512 GPGLYRYTSEDAILKHRMDHSDSPWLADTWRSTSGSRDLSSSSSLSSRLG 561 

• • • • • 

550 ADARDPLDCRRSLLSWDSRSPGVPLLPDTSTFYGSLIAELPSSTPARPSP 599 

I MM: M M : M I M M M M M M M M M M I Ml I MM 
562 LDPRDPLEGRRSLISWDPRSPGVPLLPDTSTFYGSLIAEQPSSPPVRPSP 611 

• • • • • 

600 QVPAVRRLPPQLAQLSSPCSSSDSLCSRRGLSSPRLSLAPAEAWKAKKKQ 649 

• II II I .11 Ml • I I I I I I I I I I I 111:11 I I I I I I I I I I 
612 KTPAARRFPSKLAGTSSPWASSDSLCSRRGLCSPRMSLTPTEAWKAKKKQ 661 

■ • • • « 

650 ELQHANSSPLLRGSHSLELRACELGNRGSKNLSQSPGAVPQALVAWRALG 699 

II I I I I I I I I I I I : I : I I I I • I I I I I I I I I I I . I • I . I I I . I 

662 ELHQANSSPLLRGSHPMEIWAWELGSRASKNLSQSPGEAPRAVVSWRAVG 711 

• • • • • 

700 PKLLSSSNELVTRHLPPAPLFPHETPPTQSQQTQPPVAPQAPSSILLPAA 749 

I • I • I . I I • I I I I I I . |.| I I I I I I I I I 

712 PQLHRNSSELASRPLPPTPL.SLRGASSHDPQSQCVEKLQAPSSDPLPAA 760 

• • • • m 

750 PIPILSPCSPPSPQASSLSGPSPASSRLSSSSLSSL..GEDQDSVLTPEE 797 

I : : I • I I I I I I II I I I • I I I I I I I I I I I I I I I I I I I I I I 
761 PLSVLNSSRPSSPQASFLSCPSPSSSNLSSSSLSSLEEEEDQDSVLTPEE 810 
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798 VALCLELSEGEETPRNSVS PMPRAPSPPTTYGYISVPTASEFTDMGRTGG 847 

I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I II 
811 VALCLELSDGEETPTNSVSPMPRAPSPPTTYGYISIPTCSGLADMGRAGG 860 

• • • • • 

848 GVGPKGGVLLCPPRPCLTPTPSEGSLANGWGSASEDNAASARASLVSSSD 897 

III . I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
861 GVGSEVGNLLYPPRPCPTPTPSEGSLANGWGSASEDNVPSARASLVSSSD 910 

• • • • • 

898 GSFLADAHFARALAVAVDSFGFGLEPREADCVFIDASSPPSPRDEIFLTP 947 

I I I I I I I I I I I I I I II I I I I I : I I I I I I I I I I I I I I I I I : : II 
911 GSFLADTHFARALAVAVDSFGLSLDPREADCVFTDASSPPSPRGDLSLTR 960 

• • • • • 

948 NLSLPLWEWRPDWLEDMEVSHTQRLGRGMPPWPPELSDLFPEKSAPLSYA 997 

• I I I I I I II I I I I I I I : I I I I I I I I I : I I I I I : |. |: 

961 SFSLPLWEWRPDWLEDAEISHTQRLGRGLPPWPPD SRASSQRSWL 1005 

• • ■ • • 

998 QGWCFSCRLLLNRVPETSQTGIRTTSPVPPTRPG.LWCVGLGLCFSAAGV 1046 
I III .. I I. M I I MUM! | 

1006 TGAVPKAGDSS*IVPEKAR...RAPRPLSCLSPGFLTCGGLGLCFSVIEV 1052 

• • • 

1047 ..HLPKPPESSPSTIVKTNENKIRAKLTWSP 1075 

I I : I I • : • I I 

1053 HRH*ASGELCHWDC*NK*KQTKI*SRESWIP 1083 
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34 AGTGCTCGGGACAAGGACATAGGGCTGAGAGTAGCCATGGGCTCTGGAGG 83 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1 AGTGTATGGGACAAGGAGA.GGAGCCGAGAGCAGCCATGGGCTCTGGAGG 49 

• • • . . 

84 AGACAGCCTCCTGGGGGGCAGGGGTTCCCTGCCTCTGCTGCTCCTGCTCA 133 

I IMIIIIIIII II I I I I I I I I I I I I I I I II III 

50 AACGGGCCTCCTGGGGACGGAGTGGCCTCTGCCTCTGCTGCTGCTTTTCA 99 

• ... a 

134 TCATGGGAGGCATGGCTCAGGACTCCCCGCCCCAGATCCTAGTCCACCCC 183 

I I I I I I I I I I I I I I I III II II I I I I I I I I I I I I I I I I I I I I 
100 TCATGGGAGGTGAGGCTCTGGATTCTCCACCCCAGATCCTAGTTCACCCC 149 

• • • . . 

184 CAGGACCAGCTGTTCCAGGGCCCTGGCCCTGCCAGGATGAGCTGCCAAGC 233 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
150 CAGGACCAGCTACTTCAGGGCTCTGGCCCAGCCAAGATGAGGTGCAGATC 199 
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• • • * • 

234 CTCAGGCCAGCCACCTCCCACCATCCGCTGGTTGCTGAATGGGCAGCCCC 283 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
200 ATCCGGCCAACCACCTCCCACTATCCGCTGGCTGCTGAATGGGCAGCCCC 249 

• • • • • 

284 TGAGCATGGTGCCCCCAGACGCACACCACCTCCTGCCTGATGGGACCCTT 333 



250 TCAGCATGGCCACCCCAGACCTACATTACCTTTTGCCGGATGGGACCCTC 299 

• * • • • 

334 CTGCTGCTACAGCCCCCTGCCCGGGGACATGCCCACGATGGCCAG...GC 380 

III I I I MM III II Mill I II MM MM 
300 CTGTTACATCGGCCCTCTGTCCAGGGACGGCCACAAGATGACCAGAACAT 349 

• • » • « 

381 CCTGTCCACAGACCTGGGTGTCTACACATGTGAGGCCAGCAACCGGCTTG 430 

I I I M II II I II II I II I I I I I I I I I I II I II I II I I I I I I I I I 
350 CCTCTCAGCAATCCTGGGTGTCTACACATGTGAGGCCAGCAACCGGCTGG 399 

• • • « • 

431 GCACGGCAGTCAGCAGAGGCGCTCGGCTGTCTGTGGCTGTCCTCCGGGAG 480 

till I I I I I III I II III IIIIIIIIMIIIIIIIII.il I I I I 
400 GCACAGCAGTGAGCCGGGGTGCTAGGCTGTCTGTGGCTGTCCTCCAGGAG 449 

• • • • * 

481 GATTTCCAGATCCAGCCTCGGGACATGGTGGCTGTGGTGGGTGAGCAGTT 530 

M MIIMIMM MINIMI! Mill I I I I i I I I Ml II 
450 GACTTCCAGATCCAACCTCGGGACACAGTGGCCGTGGTGGGAGAGAGCTT 499 

• • • . 

531 TACTCTGGAATGTGGGCCGCCCTGGGGCCACCCAGAGCCCACAGTCTCAT 580 

Ml II II II I II I II II II II I II I I I III I I II I II I 
500 GGTTCTTGAGTGTGGTCCTCCCTGGGGCTACCCAAAACCCTCGGTCTCAT 549 

• » • • • 

581 GGTGGAAAGATGGGAAACCCCTGGCCCTCCAGCCCGGAAGGCACACAGTG 630 

M M I I I II I II I II I I II II I I I I I I II I II II II II I II I I I 
550 GGTGGAAAGACGGGAAACCCCTGGTCCTCCAGCCAGGGAGGCGCACAGTA 599 

• • • • • 

631 TCCGGGGGGTCCCTGCTGATGGCAAGAGCAGAGAAGAGTGACGAAGGGAC 680 
M MM MUM II II I I I 1 I I I I i I I I I I I MM Mill 

600 TCTGGGGATTCCCTGATGGTGTCAAGAGCAGAGAAGAATGACTCGGGGAC 649 

• • • • . 

681 CTACATGTGTGTGGCCACCAACAGCGCAGGACATAGGGAGAGCCGCGCAG 730 
Ml I I I M I MMMIIMM II II II I II II II I I I II I I 

650 CTATATGTGTATGGCCACCAACAATGCTGGGCAACGGGAGAGCCGAGCAG 699 

• . . 

731 CCCGGGTTTCCATCCAGGAGCCCCAGGACTACACGGAGCCTGTGGAGCTT 780 
M M I I II I I II II II II I I II II III III I I I | M II I 

700 CCAGGGTGTCTATCCAGGAATCCCAGGACCACAAGGAACATCTAGAGCTT 749 
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781 CTGGCTGTGCGAATTCAGCTGGAAAATGTGACACTGCTGAACCCGGATCC 830 
MINIM M II I I I I II I II I II I i I II I INN INN M II 

750 CTGGCTGTTCGCATTCAGCTGGAAAATGTGACCCTGCTAAACCCCGAACC 799 

• • • • • 

831 TGCAGAGGGCCCCAAGCCTAGACCGGCGGTGTGGCTCAGCTGGAAGGTCA 880 

II I I II I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I 
800 TGTAAAAGGTCCCAAGCCTGGGCCATCCGTGTGGCTCAGCTGGAAGGTGA 849 

• • • • • 

881 GTGGCCCTGCTGCGCCTGCCCAATCTTACACGGCCTTGTTCAGGACCCAG 930 

I Mlllllllll Mill I II Mill II llllllllll III 
850 GCGGCCCTGCTGCACCTGCTGAGTCATACACAGCTCTGTTCAGGACTCAG 899 

• • • . • 

931 ACTGCCCCGGGAGGCCAGGGAGCTCCGTGGGCAGAGGAGCTGCTGGCCGG 980 

I INI I I III II I III I Ml III III III MM M 

900 AGGTCCCCCAGGGACCAAGGATCTCCATGGACAGAGGTGCTGCTGCGTGG 949 

• • • • • 

981 CTGGCAGAGCGCAGAGCTTGGAGGCCTCCACTGGGGCCAAGACTACGAGT 1030 

M Mllll III III II I I II I I I I I I I I I I I I I I I I I I I I II I 
950 CTTGCAGAGTGCAAAGCTTGGGGGTCTCCACTGGGGCCAAGACTATGAAT 999 

• • • • •

1031 TCAAAGTGAGACCATCCTCTGGCCGGGCTCGAGGCCCTGACAGCAACGTG 1080 

I I I I I I I I I I I I I Mill IIIIIIMIIIIIIIIIIIIIIIIII III 

1000 TCAAAGTGAGACCGTCCTCCGGCCGGGCTCGAGGCCCTGACAGCAATGTG 1049 

• • • • . 
1081 CTGCTCCTGAGGCTGCCGGAAAAAGTGCCCAGTGCCCCACCTCAGGAAGT 1130 

I I II I I I I I I I I I I I I III | I I I I I I I I I I I | | | | I I I I I I III 
1050 TTGCTCCTGAGGCTGCCTGAACAGGTGCCCAGTGCCCCACCTCAAGGAGT 1099 

• • • . • • 

1131 GACTCTAAAGCCTGGCAATGGCACTGTCTTTGTGAGCTGGGTCCCACCAC 1180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1100 GACCTTAAGATCTGGCAACGGTAGTGTCTTTGTGAGTTGGGCTCCACCAC 1149 

• • • • • 

1181 CTGCTGAAAACCACAATGGCATCATCCGTGGCTACCAGGTCTGGAGCCTG 1230 

I I I I I I I I I III I I I I I llllllllll I I I.I I I I I I I I I I I I I | I 
1150 CTGCTGAAAGCCATAATGGTGTCATCCGTGGTTACCAGGTCTGGAGCCTG 1199 

• • • • • 

1231 GGCAACACATCACTGCCACCAGCCAACTGGACTGTAGTTGGTGAGCAGAC 1280 

I I I I I I III I I I I I | II I I I I I I I | | | I I I I I I I I I I I I I 
1200 GGCAATGCCTCATTGCCTGCTGCCAACTGGACCGTAGTGGGTGAACAGAC 1249 

• • • • • 

1281 CCAGCTGGAAATCGCCACCCATATGCCAGGCTCCTACTGCGTGCAAGTGG 1330 

I I I I I I I I I I I I I I I I | | I I I I I I I I I I I I I II llllllllll 
1250 CCAGCTGGAGATCGCCACACGACTGCCAGGCTCCTATTGTGTGCAAGTGG 1299 
• • • • •

1331 CTGCAGTCACTGGTGCTGGAGCTGGGGAGCCCAGTAGACCTGTCTGCCTC 1380 

I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I 
1300 CTGCAGTCACTGGAGCTGGTGCTGGAGAACTCAGTACCCCTGTCTGCCTC 1349 

1381 CTTTTAGAGCAGGCCATGGAGCGAGCCACCCAAGAACCCAGTGAGCATGG 1430 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1350 CTTTTAGAGCAGGCCATGGAGCAATCAGCACGAGACCCCAGGAAACATGT 1399 

1431 TCCCTGGACCCTGGAGCAGCTGAGGGCTACCTTGAAGCGGCCTGAGGTCA 1480 

I I I I I I II I I I I I I I lllllllllll llll II! II II II INI 
1400 TCCCTGGACCCTGGAACAGCTGAGGGCCACCTTGAGACGACCAGAAGTCA 1449 

• • • • • 

1481 TTGCCACCTGCGGTGTTGCACTCTGGCTGCTGCTTCTGGGCACCGCCGTG 1530 

I I I I I I I I I I I 11111111111111111111 llll 
1450 TTGCCAGTAGTGCTGTCCTACTCTGGTTGCTGCTACTAGGCATTACTGTG 1499 

• • • • • 

1531 TGTATCCACCGCCGGCGCCGAGCTAGGGTGCACCTGGGCCCAGGTCTGTA 1580 

I I I I I I II I II III llll I I I I I I I I I I I I I I I I I I I I I I I I I 

1500 TGTATCTACAGACGACGCAAAGCTGGGGTGCACCTGGGCCCAGGTCTGTA 1549 

• • • • • 

1581 CAGATATACCAGTGAGGATGCCATCCTAAAACACAGGATGGATCACAGTG 1630 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1550 CAGATACACCAGCGAGGACGCCATTCTAAAACACAGGATGGACCACAGTG 1599 

• • • • • 

1631 ACTCCCAGTGGTTGGCAGACACTTGGCGTTCCACCTCTGGCTCTCGGGAC 1680 

I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 
1600 ACTCCCCATGGCTGGCAGACACCTGGCGTTCCACCTCTGGCTCTCGAGAC 1649 

• • • • •

1681 CTGAGCAGCAGCAGCAGCCTCAGCAGTCGGCTGGGGGCGGATGCCCGGGA 1730 

I I I I I I I I I I I II I I I I I I I II lllllllllll III I I I I I I 
1650 CTGAGCAGCAGCAGCAGCCTTAGTAGTCGGCTGGGATTGGACCCTCGGGA 1699 

• • • • •

1731 CCCACTAGACTGTCGTCGCTCCTTGCTCTCCTGGGACTCCCGAAGCCCCG 1780 

I I I I I I I I I I I I I I I I I I I I lllllllllll I II I I I I I I I 
1700 CCCACTAGAGGGCAGGCGCTCCTTGATCTCCTGGGACCCTCGGAGCCCCG 1749 

• • • • •

1781 GCGTGCCCCTGCTTCCAGACACCAGCACTTTTTATGGCTCCCTCATCGCT 1830 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I lllllllllll II 
1750 GTGTACCCCTGCTTCCAGACACGAGCACGTTTTACGGCTCCCTCATTGCA 1799 

• • • • • 

1831 GAGCTGCCCTCCAGTACCCCAGCCAGGCCAAGTCCCCAGGTCCCAGCTGT 1880 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1800 GAGCAGCCTTCCAGCCCTCCAGTCCGGCCAAGCCCCAAGACACCAGCTGC 1849 
• • • • •

1881 CAGGCGCCTCCCACCCCAGCTGGCCCAGCTCTCCAGCCCCTGTTCCAGCT 1930 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1850 TAGGCGCTTTCCATCCAAGTTGGCTGGAACCTCCAGCCCCTGGGCTAGCT 1899 

• • • • • 

1931 CAGACAGCCTCTGCAGCCGCAGGGGACTCTCTTCTCCCCGCTTGTCTCTG 1980 

I I I I II I I I I I I I I I I I I I I I I I I I I I I III II III I I I I I I I I 
1900 CAGATAGTCTCTGCAGCCGCAGGGGACTCTGTTCCCCACGCATGTCTCTG 1949 

• • • • • 

1981 GCCCCTGCAGAGGCTTGGAAGGCCAAAAAGAAGCAGGAGCTGCAGCATGC 2030 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 

1950 ACCCCTACAGAGGCTTGGAAGGCCAAAAAGAAGCAGGAATTGCACCAAGC 1999 

2031 CAACAGTTCCCCACTGCTCCGGGGCAGCCACTCCTTGGAGCTCCGGGCCT 2080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I II I I I I I I 
2000 TAACAGCTCCCCACTGCTCCGGGGCAGCCACCCCATGGAAATCTGGGCCT 2049 

2081 GTGAGTTAGGAAATAGAGGTTCCAAGAACCTTTCCCAAAGCCCAGGAGCT 2130 

I Mill Mil MM I I I I I I I I II I I I I MM 

2050 GGGAGTTGGGAAGCAGAGCCTCCAAGAACCTTTCTCAAAGCCCAGGAGAA 2099 

• • • • •

2131 GTGCCCCAAGCTCTGGTTGCCTGGCGGGCCCTGGGACCGAAACTCCTCAG 2180 

I II I I I II I I II I I I I I I I I II II I I I I I II I I I I I 
2100 GCGCCCCGAGCCGTGGTATCCTGGCGTGCTGTGGGACCACAACTTCACCG 2149 

2181 CTCCTCAAATGAGCTGGTTACTCGTCATCTCCCTCCAGCACCCCTCTTTC 2230 

I III I I I II I II I I I I I I I I I I I II I I I I I I I I I I I II 
2150 CAACTCCAGTGAGCTGGCATCTCGTCCACTCCCTCCAACACCCCTTTCTC 2199 

• • • • • 

2231 CTCATGAAACTCCCCCAACTCAGAGTCAACAGACCCAGCCTCCGGTGGCA 2280 

II I I I I I II I III I II I I I I I I I I III 

2200 TTCGTGGAGCTTCC . . . AGTCATGACCCACAGAGCCAGTGTGTGGAGAAG 2246 

• • • • • 

2281 CCACAGGCTCCCTCCTCCATCCTGCTGCCAGCAGCCCCCATCCCCATCCT 2330 

I I I I I I I I I I I II I II I I I I I I I I I II I I I I I I I I I I I 
2247 CTCCAAGCTCCCTCCTCTGACCCACTGCCAGCAGCCCCTCTCTCCGTCCT 2296 

• • • • •

2331 TAGCCCCTGCAGTCCCCCTAGCCCCCAGGCCTCTTCCCTCTCTGGCCCCA 2380 

I I I I I I I I I I I I I I I I I I I I II I I II II I I I I I I I I 
2297 CAACTCTTCCAGACCTTCCAGCCCCCAGGCCTCTTTCCTCTCCTGTCCTA 2346 

• • • • •

2381 GCCCAGCTTCCAGTCGCCTGTCCAGCTCCTCACTGTCATCCCT......G 2424

I I M I I Mill II II I I I I I I I I II I II I I II II I I I 
2347 GCCCATCCTCCAGCAACCTGTCCAGCTCCTCGCTGTCATCCTTAGAGGAG 2396 
• • • • •

2425 GGGGAGGATCAAGACAGCGTGCTGACCCCTGAGGAGGTAGCCCTGTGCTT 2474 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
2397 GAGGAGGATCAGGACAGCGTGCTCACCCCCGAGGAGGTAGCCCTGTGTCT 2446 

• • • • • 

2475 GGAACTCAGTGAGGGTGAGGAGACTCCCAGGAACAGCGTCTCTCCCATGC 2524 

Ml llllllll II llllllll MM MINI || Mill | | | | 
2447 GGAGCTCAGTGATGGGGAGGAGACACCCACGAACAGTGTATCTCCTATGC 2496 

• • • • •

2525 CAAGGGCTCCTTCACCCCCCACCACCTATGGGTACATCAGCGTCCCAACA 2574 

I I N llllllll II II II llllllll II I I I I I I I I I I | I 
2497 CAAGAGCTCCTTCCCCGCCAACAACCTATGGCTATATCAGCATACCAACC 2546 

2575 GCCTCAGAGTTCACGGACATGGGCAGGACTGGAGGAGGGGTGGGGCCCAA 2624 

Mill I I II II I I II I II I I I 1 II II I I I I I I I I 
2547 TGCTCAGGACTGGCAGACATGGGCAGAGCTGGCGGGGGCGTGGGGTCTGA 2596 

... 
2625 GGGGGGAGTCTTGCTGTGCCCACCTCGGCCCTGCCTCACCCCCACCCCCA 2674 

M II I II I I I I I I I I I II I I I I I I I I I I I I | I | | | | I I I 
2597 GGTTGGGAACTTACTGTATCCACCTCGGCCCTGCCCCACCCCTACACCCA 2646 
• 

2675 GCGAGGGCTCCTTAGCCAATGGTTGGGGCTCAGCCTCTGAGGACAATGCC 2724 

M II I I I I I I I I III MINIM. Ml I I I I I I I I I I I I I I 

2647 GCGAGGGCTCCCTGGCCAATGGTTGGGGCTCAGCTTCTGAGGACAATGTC 2696 

• • . . . 
2725 GCCAGCGCCAGAGCCAGCCTTGTCAGCTCCTCCGATGGCTCCTTCCTCGC 2774 

MIIMIMI llllllll II Mill II [ I I I I I 1 I ) I I I I I I I I 
2697 CCCAGCGCCAGGGCCAGCCTGGTTAGCTCTTCTGATGGCTCCTTCCTCGC 2746 

• • « ... 
2775 TGATGCTCACTTTGCCCGGGCCCTGGCAGTGGCTGTGGATAGCTTTGGTT 2824 

MM II I II II I I I II I II II II I I II I II II II I I I II II I I I I 
2747 TGATACTCACTTTGCTCGTGCCCTGGCAGTGGCTGTGGATAGCTTTGGCC 2796 
• 

2825 TCGGTCTAGAGCCCAGGGAGGCAGACTGCGTCTTCATAGATGCCTCATCA 2874 

M MM II llllllll II I I I I I I I I I I I | | I | | | I | |.| || | 
2797 TCAGTCTGGATCCCAGGGAAGCTGACTGTGTCTTCACTGATGCCTCATCA 2846 

* 

2875 CCTCCCTCCCCACGGGATGAGATCTTCCTGACCCCCAACCTCTCCCTGCC 2924 

I M I I II II II I II I Ml IN llllllll I I I II I I II II 
2847 CCTCCCTCCCCTCGGGGTGATCTCTCCCTGACCCGAAGCTTCTCTCTGCC 2896 

• ... • . 
2925 CCTGTGGGAGTGGAGGCCAGACTGGTTGGAAGACATGGAGGTCAGCCACA 2974 

M M M I I I II II I I | | | | | | | | | | | | | | | | Ml I I I II I I I I 

2897 TTTGTGGGAGTGGAGGCCAGACTGGTTGGAAGATGCTGAGATCAGCCACA 2946 
2975 CCCAGCGGCTGGGAAGGGGGATGCCTCCCTGGCCCCCTGAACTCTCAGAT 3024

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I 
2947 CCCAGAGGCTGGGGAGGGGGCTGCCTCCCTGGCCTCCTG . ATTCTAGGGC 2995 

• • • • •

3025 CTCTTCCCAGAGAAGTCAGCTCCACTGTCGTATGCCCAAGGCTGGTGCTT 3074 

I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I I 
2996 CTCTTCCCAGCGAAGTTGGCTAACTGGTGCTGTGCCCAAGGCTGGT 3041 

• • • • • 

3075 CTCCTGTAGATTACTCCTGAACCGTGTCCCTGAGACTTCCCAGACGGGAA 3124 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

3042 GATTCCTCCTGAA . . TTGTCCCTGAGA . AGGCCAGAAGAGCA 3080 

• • • • •

3125 TCAGAACCACTTCTCCTGTTCCACCCACAAGACCTGG . . . GCTGTGGTGT 3171 

I I I I I I I I I I I I I I II I I I I I II III 
3081 CCCAGACCAC . TCTCCTGTCTGTCC CCTGGCTTTCTCACATGT 3122 

• • • • • 

3172 GTGGGTCTTGGCCTGTGTTTCTCTGCAGCTGGGGTCCACCTTC . CCAAGC 3220 
I I I I I I I I I I I I I I I I I I I I I I I II M I I I II I I II 

3123 GGAGGTCTTGGCCTATGCTTCTCTGTAATAGAAGTCCACCGTCACTAGGC 3172 

• • • • • 
3221 CTCCAGAGAGTTCTCCCTCCACGATTGTGAAAACAAATG.....AAAACA 3265

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

3173 TTCTGGAGAGCTCTGTCATTGGGATTGTTAAAATAAATGAAAGCAAACCA 3222 

• • • • • 

3266 AAATTAGAGCAAAGCTGACCTGGA . GCCCTCAGGGAGCAAAACATCATCT 3314 

I I I I I I II I I I I I I I I I I I I II | | | | | | | | | || 
3223 AAATATGATCACGGGAGTCTTGGATTCCCACTGAGAACAAGACAGCATCT 3272 

• • • • • 

3315 CCACCTGACTCCTAGCCACTGCTTTCTCCTCTGTGCCATCCACTCCCACC 3364 

M I I I I III I I I I I I I I 

3273 TCA. GGACAGCAGACTC TCCACAACCAGA 3300 

• • • • • 

3365 ACCAGGTTGTTTTGGCCTGAGGAGCAGCCCTGCCTGCTGCTCTTCCCCCA 3414 

Ml I I I I I I I I III I I I I I I I I II III 

3301 ACC TTTGGCCT. . .AAGTAAGCCTGGCTCCGGAGCT. .CCCAC 3338 

• • • • • 

3415 CCATTTGGATCACAGGAAGTGGAGGAGCCAGAGGTGCCTTTGTGGAGGAC 3464

M lllllll I III I I Mill I III I Mill 

3339 CTAAGTGGATCATGGAAAGAAGGGAAGCCAACCAGGTCTTCAGGAAGGAC 3388 

• • • • . • 

3465 AGCAGTGGCTGCTGGGAGAGGGCTGTGGAGGAAGGAGCTTCTCGGAGCCC 3514 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3389 AGAAAT . GTTTTTTGGTGAGGGCTATGGTGGA . . GGACCTGTGGAAGAGC 3435 
3515 CCTCTCAGCCTTACCTGGGCCCCTCCTCTAGAGAAGAGCTCAACTCTCT. 3563

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

3436 CCTCTCATATCTACTTGGACTCCTCCCTTAGAGGCCAGCTCAACCCTTTC 3485 

• • • • • 

3564 CCCAACCTCACCATGGAAAGAAAAT . AATTATGAATGCCACTGAGGCACT 3612 

I I I I I I I 1 I I I I I I I I I I I I I I I I | I I I I I I I 
3486 CCCAGTCACACCATGCAAGGAAACTAAAGGAGAAAGGTCGTGGATGCAGT 3535 

• • • • • 

3613 GAGGCCCTACCTCATGCCAAACAAAGGGTTCAAGGCTGGGTCTAGCGAGG 3662 

I I II I I I II I Mill I I II I Nil 

3536 GGGCCCTATACAGCGTCACAGTCAATGCTTCAAAGTGAGATCAATGGAGG 3585 

3663 ATGCTGAAGGAAGGGAGGTATG 3684

I I I I I I I I I I I I I I I I 
3586 AGACTGAAGGAAAGGACGCAGG 3607 
Figure 19

Figure 20

Figure 23